* initial work on l1
* l1 int8 neon implementation
* tweak l1 int8 and add test
* broken overflow still
* some progress on l1
* change to i32 instead of i64
* remove comment
* ignore poetry stuff
* unrolled l1 int8 and format
* remove comments
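
For context on the overflow and accumulator-width commits, here is a minimal scalar sketch of an int8 L1 distance. It is not the NEON code from these commits. The names `l1_int8` and `max_len_for_i32` are hypothetical, and the reasoning about why an i32 accumulator is enough is an assumption, not something the log states. The difference of two int8 values can reach 255 in magnitude, so it does not fit back into int8 and has to be widened before accumulating.

```python
I32_MAX = 2**31 - 1  # largest value a signed 32-bit accumulator can hold


def l1_int8(a, b):
    """Scalar reference: sum of |a[i] - b[i]| over int8-valued sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    total = 0
    for x, y in zip(a, b):
        if not (-128 <= x <= 127 and -128 <= y <= 127):
            raise ValueError("values must fit in int8")
        # Python ints are unbounded, so this models a widened subtraction;
        # in fixed-width code the difference must be widened past 8 bits.
        total += abs(x - y)
    return total


def max_len_for_i32():
    """Longest vector whose worst-case sum (255 per element) fits in i32."""
    return I32_MAX // 255


print(l1_int8([127, -128], [-128, 127]))  # worst case per element: 255 + 255
print(max_len_for_i32())
```

Under these assumptions, an i32 accumulator cannot overflow for vectors of up to roughly 8.4 million elements, which would make i64 unnecessary for typical embedding sizes.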