# Repo Context for AI Agents (PTO Tile Lib)

pto-isa: Parallel Tile Operation (PTO) is a virtual instruction set architecture designed by Ascend CANN, focusing on tile-level operations. This repository offers high-performance, cross-platform tile operations across Ascend platforms. Project page: https://gitcode.com/cann/pto-isa

This document is a fast, practical orientation for agents working in this repo: what it is, where the key entrypoints live, and the shortest paths to build/run in **CPU**, **NPU simulator (sim)**, and **on-board NPU (npu)** modes.

## What This Repo Is

- **PTO Tile Library**: C++ headers and implementations for the PTO (Parallel Tile Operation) virtual ISA defined by Ascend CANN.
- Supports multiple backends:
  - **CPU simulation** (cross-platform, no Ascend driver/CANN required).
  - **Ascend NPU** backends split by SoC generation:
    - **A2/A3 family**: `include/pto/npu/a2a3/` (selected via `-v a3` in test scripts).
    - **A5**: `include/pto/npu/a5/`.
- Primary include for upper-layer code: `#include <pto/pto-inst.hpp>` (unified entry header).

## Repo Map (Where To Look First)

- Project overview and common commands: `README.md`
- Detailed setup (CPU first, then NPU): `docs/getting-started.md`
- ISA docs and navigation:
  - `docs/README.md` (ISA guide entry)
  - `docs/isa/` (per-instruction reference)
- Public API headers and backend status table: `include/README.md`
- Core public headers / backend split: `include/pto/README.md`
- Build/package entrypoint: `build.sh`, top-level `CMakeLists.txt`, `cmake/`
- Tests entrypoints:
  - CPU simulator tests: `tests/run_cpu.py`, `tests/run_cpu_tests.sh`
  - NPU ST build/run: `tests/script/run_st.py`, `tests/run_st.sh`
  - Test layout overview: `tests/README.md`
- Demos: `demos/` (CPU demos used by `tests/run_cpu.py --demo ...`)

## Run: CPU Simulator (Recommended First)

CPU simulation is meant to be the "works everywhere" correctness path.

From repo root:

```bash
python3 tests/run_cpu.py --clean --verbose
```

Useful variants:

```bash
python3 tests/run_cpu.py --testcase tadd
python3 tests/run_cpu.py --testcase tadd --gtest_filter 'TADDTest.*'
python3 tests/run_cpu.py --demo gemm --verbose
python3 tests/run_cpu.py --demo flash_attn --verbose
```

Notes:

- CPU ST uses CMake and GoogleTest; it may download GTest if it is not installed system-wide.
- The compiler must support at least C++20 (see `tests/cpu/st/CMakeLists.txt`).
- Enabling bfloat16 support in CPU-SIM requires GCC 14.

## Run: NPU ST (Ascend) in `sim` and `npu` Modes

NPU ST is built/run via `tests/script/run_st.py`:

```bash
python3 tests/script/run_st.py -r [sim|npu] -v [a3|a5] [-a] -t testcase -g gtest_filter
```

Key points:

- `-a` compiles the test case in auto mode instead of manual mode.
- `-v a3` selects the **A2/A3** implementation under `include/pto/npu/a2a3/` (the test script maps it to a SoC string like `Ascend910B1`).
- `-r sim` uses the Ascend simulator libraries under `$ASCEND_HOME_PATH/tools/simulator/SOC/lib` and `runtime/lib64/stub`.
- `-r npu` runs on real hardware.

Examples (single case):

```bash
python3 tests/script/run_st.py -r sim -v a3 -t tadd -g TADDTest.case_float_64x64_64x64
python3 tests/script/run_st.py -r npu -v a3 -t tadd -g TADDTest.case_float_64x64_64x64
```

Recommended suites (wrapper script):

```bash
chmod +x ./tests/run_st.sh
./tests/run_st.sh a3 sim simple
./tests/run_st.sh a3 npu simple
```

## Environment: Ascend CANN / Toolkit

NPU ST requires a working Ascend environment. Typical setup (choose the correct install path):

```bash
source /usr/local/Ascend/cann/bin/setenv.bash
# or
source $HOME/Ascend/ascend-toolkit/latest/bin/setenv.bash
```

`tests/script/run_st.py` expects `ASCEND_HOME_PATH` to be set (usually done by `setenv.bash`).

## Common Pitfalls (And How This Repo Handles Them)

- **GTest ABI mismatch on Linux**: some systems have `libgtest*.a` built with `_GLIBCXX_USE_CXX11_ABI=0`. The CPU and NPU ST CMake projects support `PTO_GLIBCXX_USE_CXX11_ABI=auto|0|1` and auto-detect the setting when possible.
- **`sim` open-files limit**: simulator runs may require a higher `ulimit -n` (see `docs/getting-started.md` and `build.sh`).
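When an agent needs to launch many NPU ST cases, assembling the `run_st.py` command line programmatically avoids typos in the flags. The sketch below is a hypothetical helper (the names `build_run_st_cmd`, `RUN_MODES`, and `SOC_VERSIONS` are not part of the repo); it only uses the flags `-r`, `-v`, `-a`, `-t`, and `-g` from the usage line in this document and assumes the command is run from the repo root.

```python
import shlex
from typing import List

# Values taken from the usage line: -r [sim|npu] -v [a3|a5]
RUN_MODES = ("sim", "npu")
SOC_VERSIONS = ("a3", "a5")


def build_run_st_cmd(run_mode: str, soc: str, testcase: str,
                     gtest_filter: str, auto_mode: bool = False) -> List[str]:
    """Return the argv list for tests/script/run_st.py (hypothetical helper)."""
    if run_mode not in RUN_MODES:
        raise ValueError(f"run_mode must be one of {RUN_MODES}, got {run_mode!r}")
    if soc not in SOC_VERSIONS:
        raise ValueError(f"soc must be one of {SOC_VERSIONS}, got {soc!r}")

    cmd = ["python3", "tests/script/run_st.py", "-r", run_mode, "-v", soc]
    if auto_mode:
        cmd.append("-a")  # compile the test case in auto mode instead of manual mode
    cmd += ["-t", testcase, "-g", gtest_filter]
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    example = build_run_st_cmd("sim", "a3", "tadd",
                               "TADDTest.case_float_64x64_64x64")
    print(shlex.join(example))
```

To actually execute the command, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repo root; passing a list rather than a shell string keeps filters such as `TADDTest.*` from being expanded by the shell.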
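The two environment conditions this document mentions for NPU ST runs (`ASCEND_HOME_PATH` being set, and a sufficiently high open-files limit for `sim`) can be checked before launching a test. The following is a minimal sketch, not code from the repo: the function names are hypothetical, and the threshold `ASSUMED_MIN_OPEN_FILES` is an assumption chosen for illustration, since the required value is not stated here (see `docs/getting-started.md`).

```python
import os
from typing import List, Mapping, Optional

# Assumption for illustration only; the repo docs define the real requirement.
ASSUMED_MIN_OPEN_FILES = 4096


def current_soft_nofile() -> Optional[int]:
    """Return the soft open-files limit, or None if it is unlimited (POSIX only)."""
    import resource  # not available on Windows
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return None if soft == resource.RLIM_INFINITY else soft


def preflight_problems(env: Mapping[str, str],
                       soft_nofile: Optional[int],
                       min_open_files: int = ASSUMED_MIN_OPEN_FILES) -> List[str]:
    """Collect human-readable problems; an empty list means the checks passed."""
    problems = []
    if not env.get("ASCEND_HOME_PATH"):
        problems.append("ASCEND_HOME_PATH is not set; source setenv.bash first")
    if soft_nofile is not None and soft_nofile < min_open_files:
        problems.append(
            f"open-files limit is {soft_nofile}; raise it with "
            f"'ulimit -n {min_open_files}' or higher before running sim"
        )
    return problems


if __name__ == "__main__":
    for problem in preflight_problems(os.environ, current_soft_nofile()):
        print("WARNING:", problem)
```

Keeping `preflight_problems` free of direct system calls makes it easy to exercise with a fake environment mapping, while `current_soft_nofile` isolates the POSIX-specific lookup.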