fix: harden CUDA safety checks and translate comments to English

Safety fixes (4 critical, 4 warning) from code review:

- qap.cuh: fix clone_to_device cross-device D2H by retaining host matrices
- types.cuh: add CUDA_CHECK to InjectBuffer, track owner_gpu for safe destroy
- types.cuh: add bounds check on lexicographic priority index
- solver.cuh: cap migrate_kernel islands to MAX_ISLANDS=64 to prevent stack overflow
- multi_gpu_solver.cuh: guard against 0 GPUs, propagate stop_reason from best GPU
- types.cuh: warn on SeqRegistry overflow
- solver.cuh: warn when constraint_directed/phased_search disabled without AOS
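
A minimal host-side sketch of three of the guards listed above (the island cap, the zero-GPU guard, and the priority-index bounds check). Only MAX_ISLANDS=64 comes from this commit. The function names clamp_island_count, checked_gpu_count, and priority_index_valid are hypothetical, and the actual code in solver.cuh, multi_gpu_solver.cuh, and types.cuh may be structured differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <stdexcept>

// Value taken from the commit message; everything else here is illustrative.
constexpr int MAX_ISLANDS = 64;

// Hypothetical helper: clamp a requested island count so that a fixed-size
// array of MAX_ISLANDS entries can never be indexed out of range.
inline int clamp_island_count(int requested) {
    if (requested > MAX_ISLANDS) {
        std::fprintf(stderr, "warning: %d islands requested, capping to %d\n",
                     requested, MAX_ISLANDS);
    }
    const int n = std::max(0, std::min(requested, MAX_ISLANDS));
    assert(n >= 0 && n <= MAX_ISLANDS);  // post-condition of the clamp
    return n;
}

// Hypothetical helper: reject an empty device list before any per-GPU
// work is scheduled.
inline int checked_gpu_count(int detected) {
    if (detected <= 0) {
        throw std::runtime_error("no CUDA devices available");
    }
    return detected;
}

// Hypothetical helper: bounds check for a lexicographic priority index.
inline bool priority_index_valid(int idx, int num_objectives) {
    return idx >= 0 && idx < num_objectives;
}
```

In a real build the argument to checked_gpu_count would presumably come from cudaGetDeviceCount, and the clamp would run on the host before the kernel launch.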

Translate all Chinese comments to English across 25+ source files
(core/*.cuh, problems/*.cuh, Makefile, multi-GPU tests).

Verified on V100S×2 (sm_70, CUDA 12.8): e5 (12 problem types, all optimal),
e13 (multi-objective + multi-GPU, 9 configs, all passed).
L-yang-yang 2026-03-25 11:52:50 +08:00
parent ab278d0e82
commit a848730459
25 changed files with 1147 additions and 1167 deletions

@@ -1,10 +1,10 @@
 # GenSolver Makefile
 #
-# 用法:
-# make e1 e2 e3 e4 e5 e6 → 编译单个实验
-# make diag → 编译诊断程序
-# make all → 编译全部
-# make clean → 清理
+# Usage:
+# make e1 e2 e3 e4 e5 e6 → Build individual experiments
+# make diag → Build diagnostic program
+# make all → Build all
+# make clean → Clean
 NVCC = nvcc
 ARCH ?= -arch=sm_75
@@ -40,10 +40,10 @@ $(EXP_DIR)/%/gpu: $(EXP_DIR)/%/gpu.cu $(ALL_HEADERS) problems/tsplib_data.h
 $(EXP_DIR)/e0_diagnosis/bench_diagnosis: $(EXP_DIR)/e0_diagnosis/bench_diagnosis.cu $(ALL_HEADERS)
 	$(NVCC) $(ARCH) $(CFLAGS) $(INCLUDES) -o $@ $<
-test_multi_gpu: test_multi_gpu.cu $(ALL_HEADERS)
+test_multi_gpu: $(EXP_DIR)/e9_multi_gpu_b3/test_multi_gpu.cu $(ALL_HEADERS)
 	$(NVCC) $(ARCH) $(CFLAGS) $(INCLUDES) -o $@ $<
-test_multi_gpu_b3: test_multi_gpu_b3.cu $(ALL_HEADERS)
+test_multi_gpu_b3: $(EXP_DIR)/e9_multi_gpu_b3/test_multi_gpu_b3.cu $(ALL_HEADERS)
 	$(NVCC) $(ARCH) $(CFLAGS) $(INCLUDES) -o $@ $<
 clean: