| COMMIT |
0.20 |
[fix] harmonize template |
|
Uses technical term 'harmonize' but in b |
2025-11-06 |
| COMMIT |
0.15 |
fix: attn_forwad when is_causal=True assert attn_mask is Non |
|
Slightly more descriptive but remains te |
2025-11-18 |
| COMMIT |
0.10 |
[update] params log |
|
Bare-bones commit messages typical of hu |
2026-01-07 |
| COMMIT |
0.10 |
[update] mask log |
|
Bare-bones commit messages typical of hu |
2026-01-07 |
| COMMIT |
0.10 |
[update] readme |
|
Bare-bones commit messages typical of hu |
2026-01-06 |
| COMMIT |
0.10 |
[update] simplify loader |
|
Bare-bones commit messages typical of hu |
2026-01-05 |
| COMMIT |
0.10 |
[update] readme |
|
Bare-bones commit messages typical of hu |
2026-01-05 |
| COMMIT |
0.10 |
[update] rename train tokenizer |
|
Bare-bones commit messages typical of hu |
2026-01-05 |
| COMMIT |
0.10 |
[update] readme |
|
Bare-bones commit messages typical of hu |
2026-01-05 |
| COMMIT |
0.10 |
[fix] messages num |
|
Bare-bones commit messages typical of hu |
2026-01-04 |
| COMMIT |
0.10 |
[fix] dist cleanup |
|
Bare-bones commit messages typical of hu |
2026-01-02 |
| COMMIT |
0.10 |
[feat] update yarn |
|
Extremely terse commit-style messages wi |
2025-12-01 |
| COMMIT |
0.10 |
[feat] release memory |
|
Very brief technical phrasing typical of |
2025-11-27 |
| COMMIT |
0.10 |
[fix] ppo mask |
|
Minimal message uses domain abbreviation |
2025-11-19 |
| COMMIT |
0.10 |
[fix] model attn_mask |
|
Concise technical fix reference with dom |
2025-11-19 |
| COMMIT |
0.10 |
[fix] update model |
|
Simple two-word technical instruction wi |
2025-11-18 |
| COMMIT |
0.10 |
[fix] prompt length calculate |
|
Brief domain-specific fix reference with |
2025-11-15 |
| COMMIT |
0.10 |
[fix] model-name |
|
Minimal hyphenated fix reference typical |
2025-11-07 |
| COMMIT |
0.10 |
[feat] update requirements |
|
Extremely terse technical commit message |
2025-10-23 |
| COMMIT |
0.10 |
[feat] update readme |
|
Minimal commit message with specific tec |
2025-10-23 |
| COMMIT |
0.10 |
[feat] update readme |
|
Brief, repetitive commit message typical |
2025-10-23 |
| COMMIT |
0.10 |
[feat] repetition-penalty |
|
Specialized ML term with no polite or fo |
2025-10-23 |
| COMMIT |
0.10 |
[feat] convert2llama |
|
Technical shorthand conversion label wit |
2025-10-23 |
| COMMIT |
0.10 |
[feat] shuffle data |
|
Concise data operation command, not AI-s |
2025-10-23 |
| COMMIT |
0.10 |
[fix] loss-issues-430 |
|
Issue-specific fix reference with techni |
2025-10-23 |
| COMMIT |
0.10 |
[fix] restore |
|
Single-word commit message - too minimal |
2025-10-23 |
| COMMIT |
0.10 |
[fix] issue-431 |
|
Issue number reference in terse human-st |
2025-10-23 |
| COMMIT |
0.10 |
[fix] sampler-ddp |
|
Technical DDP sampler fix with domain-sp |
2025-10-23 |
| COMMIT |
0.00 |
[update] random seed |
|
Very terse, standard commit message styl |
2026-03-27 |
| COMMIT |
0.00 |
[update] fp16 inference |
|
Brief, domain-specific commit message. |
2026-03-27 |
| COMMIT |
0.00 |
[update] image |
|
Terse, mechanical commit; typical human |
2026-03-26 |
| COMMIT |
0.00 |
[update] change default seq_len |
|
Concise and technical; common human comm |
2026-03-26 |
| COMMIT |
0.00 |
[update] minimind-3 |
|
Very brief, informal; no AI hallmarks pr |
2026-03-24 |
| COMMIT |
0.00 |
[fix] align log/save last-step check and ETA with 1-indexed |
|
Specific technical fix; human-like jargo |
2026-03-24 |
| COMMIT |
0.00 |
[fix] gradient accumulation step alignment |
|
Direct technical fix; human engineering |
2026-03-24 |
| COMMIT |
0.00 |
[update] empty_think_ratio |
|
Minimal, to-the-point; lacks AI politene |
2026-02-06 |
| COMMIT |
0.00 |
[update] empty_think_ratio |
|
Repetitive, brief update; likely human c |
2026-02-05 |
| COMMIT |
0.00 |
[feat] data process |
|
Ambiguous but informal; typical human sh |
2026-02-05 |
| COMMIT |
0.00 |
[update] save interval |
|
Concise and clear; common human commit m |
2026-01-30 |
| COMMIT |
0.00 |
[update] safe half |
|
Very brief, technical term; no AI stylis |
2026-01-30 |
| COMMIT |
0.00 |
[fix] data skip |
|
Extremely terse and informal phrasing ty |
2026-01-18 |
| COMMIT |
0.00 |
[update] shuffle data |
|
Brief, informal update note with common |
2026-01-18 |
| COMMIT |
0.00 |
[fix] max length |
|
Minimalist technical fix; lacks any poli |
2026-01-17 |
| COMMIT |
0.00 |
[update] pretrain load |
|
Concise, technical update note with stan |
2026-01-17 |
| COMMIT |
0.00 |
[update] align mask |
|
Very short and specific; common ML/engin |
2026-01-15 |
| COMMIT |
0.00 |
[update] align loss |
|
Terse technical update; no hallmark AI p |
2026-01-14 |
| COMMIT |
0.00 |
[fix] compile unpack |
|
Brief human-style fix message ('fix comp |
2026-01-14 |
| COMMIT |
0.00 |
[feat] add compile |
|
Succinct feature addition; common Git co |
2026-01-14 |
| COMMIT |
0.00 |
[update] prompt prefill |
|
Direct technical update; uses specific j |
2026-01-13 |
| COMMIT |
0.00 |
[update] show speed |
|
Short, informal update typical of human |
2026-01-07 |
| COMMIT |
0.00 |
[update] rename reason |
|
Bare-bones commit messages typical of hu |
2026-01-05 |
| COMMIT |
0.00 |
[update] aux loss |
|
Terse commit-style messages with technic |
2026-01-01 |
| COMMIT |
0.00 |
[fix] experts unused |
|
Concise fix notation typical of develope |
2025-12-31 |
| COMMIT |
0.00 |
[fix] layers set 8 |
|
Very brief technical notation lacking AI |
2025-12-31 |
| COMMIT |
0.00 |
[fix] moe unused |
|
Short technical fix message with domain- |
2025-12-31 |
| COMMIT |
0.00 |
[feat] get params |
|
Simple feature description with no AI-st |
2025-12-31 |
| COMMIT |
0.00 |
[feat] get params |
|
Identical to previous item; typical huma |
2025-12-31 |
| COMMIT |
0.00 |
[feat] update config |
|
Brief config update notation without AI |
2025-12-31 |
| COMMIT |
0.00 |
[feat] update lr |
|
Abbreviated technical term (lr) with no |
2025-12-31 |
| COMMIT |
0.00 |
[feat] compatible tokenizer |
|
Concise feature description using domain |
2025-12-31 |
| COMMIT |
0.00 |
[feat] stream load data |
|
Terse technical notation typical of deve |
2025-12-28 |
| COMMIT |
0.00 |
[feat] remove empty_cache |
|
Terse, technical message with typical Gi |
2025-12-26 |
| COMMIT |
0.00 |
[feat] explicit left padding |
|
Concise, technical message with standard |
2025-12-23 |
| COMMIT |
0.00 |
[fix] lora weight |
|
Minimal, direct technical fix descriptio |
2025-12-22 |
| COMMIT |
0.00 |
Fix: support loading DDP-saved LoRA weights for inference |
|
Direct technical description with specif |
2025-12-22 |
| COMMIT |
0.00 |
[feat] adjust seq length |
|
Very brief, technical update typical of |
2025-12-14 |
| COMMIT |
0.00 |
[feat] update readme |
|
Very brief update message, common for Gi |
2025-12-11 |
| COMMIT |
0.00 |
[fix] dtype & lr |
|
Extremely concise technical fix with com |
2025-12-09 |
| COMMIT |
0.00 |
[fix] Refactor get_lr function to include min_lr calculation |
|
Detailed, specific technical explanation |
2025-12-06 |
| COMMIT |
0.00 |
[fix] reduce aux_loss_alpha |
|
Minimal technical message about a parame |
2025-12-05 |
| COMMIT |
0.00 |
[fix] cuda memory #559 |
|
Terse, context-specific technical messag |
2025-12-01 |
| COMMIT |
0.00 |
[feat] add MNN support to README. |
|
Specific feature addition but remains co |
2025-11-10 |
| COMMIT |
0.00 |
[feat] clear cache |
|
Terse, informal commit style typical of |
2025-11-06 |
| COMMIT |
0.00 |
[fix] harmonize template |
|
Concise technical wording, lacks AI form |
2025-11-02 |
| COMMIT |
0.00 |
[feat] update import |
|
Brief and direct, common in human commit |
2025-10-31 |
| COMMIT |
0.00 |
[feat] update readme |
|
Minimal update note, no AI stylistic mar |
2025-10-30 |
| COMMIT |
0.00 |
[feat] update readme |
|
Repetitive simple update, typical human |
2025-10-30 |
| COMMIT |
0.00 |
[feat] update readme |
|
Same as previous, no signs of AI generat |
2025-10-30 |
| COMMIT |
0.00 |
[feat] update datasets |
|
Direct technical term, no AI phrasing. |
2025-10-30 |
| COMMIT |
0.00 |
[feat] update args |
|
Short, informal, lacks any AI hallmarks. |
2025-10-30 |
| COMMIT |
0.00 |
[feat] add args |
|
Minimalist and terse, characteristic of |
2025-10-30 |
| COMMIT |
0.00 |
[feat] update readme |
|
Identical to other updates, clearly huma |
2025-10-29 |
| COMMIT |
0.00 |
[fix] model device |
|
Very terse, technical commit format with |
2025-10-29 |
| COMMIT |
0.00 |
[feat] update trainer |
|
Minimal, repetitive commit title typical |
2025-10-28 |
| COMMIT |
0.00 |
[feat] update trainer |
|
Identical to previous; likely batch huma |
2025-10-28 |
| COMMIT |
0.00 |
[feat] update readme |
|
Brief, standard commit to update documen |
2025-10-27 |
| COMMIT |
0.00 |
[feat] update readme |
|
Repetitive format, no AI stylistic phras |
2025-10-26 |
| COMMIT |
0.00 |
[feat] update eval-llm |
|
Abbreviated, informal module name like ' |
2025-10-26 |
| COMMIT |
0.00 |
[feat] pause-training |
|
Hyphenated, descriptive feature name com |
2025-10-26 |
| COMMIT |
0.00 |
[feat] update readme |
|
Repetitive, minimal content typical of m |
2025-10-23 |
| COMMIT |
0.00 |
[feat] update readme |
|
Identical pattern; no AI hallmarks like |
2025-10-23 |
| COMMIT |
0.00 |
[feat] update requirements |
|
Terse, technical commit typical of depen |
2025-10-23 |
| COMMIT |
0.00 |
[fix] graph-oom & ddp-pos_cis |
|
Extremely terse and informal, typical of |
2025-10-23 |
| COMMIT |
0.00 |
[fix] git track |
|
Short, minimal text with informal phrasi |
2025-10-21 |
| COMMIT |
0.00 |
[feat] update readme |
|
Brief and typical human update to docume |
2025-10-21 |
| COMMIT |
0.00 |
[feat] minimind-2510 |
|
Concise, likely a project-specific refer |
2025-10-21 |
| COMMIT |
0.00 |
[feat] update eval |
|
Very terse; lacks any AI-style politenes |
2025-10-17 |
| COMMIT |
0.00 |
[feat] update requirements |
|
Minimal description, common for dependen |
2025-10-16 |
| COMMIT |
0.00 |
[fix] update model |
|
Too brief and direct to be AI-generated. |
2025-10-16 |
| COMMIT |
0.00 |
[fix] update model |
|
Identical to previous; no AI hallmarks. |
2025-10-16 |
| PR |
0.00 |
Use einops to further improve code readability |
|
Chinese text, concise and topic-specific |
2026-03-27 |
| PR |
0.00 |
[feat] add dapo algorithm |
|
Detailed technical phrasing, human tone, |
2026-03-27 |
| PR |
0.00 |
merge redundant forward passes for logps and aux_loss (in tr |
|
— |
2026-03-24 |
| PR |
0.00 |
Add dataset loading, web content scraping, and data preprocessing logic |
|
— |
2025-12-16 |
| PR |
0.00 |
[docs] Fix wording in RLHF section of README.md file |
|
— |
2026-01-27 |
| PR |
0.00 |
[docs]: clarify pretraining data format in README |
|
— |
2025-12-31 |
| PR |
0.00 |
refactor: optimize tensor wrapping in lm_dataset.py |
|
— |
2025-12-20 |
| PR |
0.00 |
[fix] Fix mixing of 1-indexed step and 0-indexed logic in training scripts |
|
— |
2026-03-24 |
| PR |
0.00 |
Fix SFT resume with torch.compile enabled |
|
— |
2026-03-19 |
| PR |
0.00 |
Create test branch |
|
— |
2026-03-22 |
| PR |
0.00 |
[mod & add] fix spo algorithm, add dapo and cispo algorithm |
|
— |
2026-01-30 |
| PR |
0.00 |
Mega |
|
— |
2025-06-26 |
| PR |
0.00 |
Add dynamic growth pipeline, eval tooling, and overnight run |
|
— |
2026-02-22 |
| PR |
0.00 |
Update comments in model / trainer & new PyTorch version Automatic Mixed Preci |
|
— |
2026-02-04 |
| PR |
0.00 |
Update requirements.txt |
|
— |
2024-10-28 |
| PR |
0.00 |
Auto tokenizer name path fix |
|
— |
2024-10-03 |
| PR |
0.00 |
Update requirements |
|
— |
2024-10-01 |
| PR |
0.00 |
[add] add gating term on po algorithm |
|
— |
2026-02-03 |
| PR |
0.00 |
add muon optimizer |
|
— |
2026-02-02 |
| PR |
0.00 |
modified: .gitignore |
|
— |
2026-01-19 |
| PR |
0.00 |
Fix DPO loss_mask boundary (include first assistant token) |
|
— |
2026-01-07 |
| PR |
0.00 |
[feat] Support Minimind retrieval-augmented generation (RAG) |
|
— |
2025-11-16 |
| PR |
0.00 |
perf: merge LoRA weights into model for inference |
|
— |
2025-12-23 |
| PR |
0.00 |
Create RESOURCES.md for MiniMind project |
|
— |
2025-10-24 |
| PR |
0.00 |
Perf/merge lora weights |
|
— |
2025-12-23 |
| PR |
0.00 |
Fix: support loading DDP-saved LoRA weights for inference |
|
— |
2025-12-22 |
| PR |
0.00 |
[feat] add interactive notebook |
|
— |
2025-02-23 |
| PR |
0.00 |
[feat] Add Training Web UI |
|
— |
2025-11-06 |
| PR |
0.00 |
fix: adjust the targets LoRA is applied to in model_lora.py |
|
— |
2025-12-05 |
| PR |
0.00 |
Add attention gate |
|
— |
2025-12-12 |
| PR |
0.00 |
feat: add LoRA alpha scaling factor and command-line support |
|
— |
2025-12-12 |
| PR |
0.00 |
[fix] Refactor get_lr function to include min_lr calculation |
|
— |
2025-12-06 |
| PR |
0.00 |
feat: add merge_lora.py to support merging LoRA weights into |
|
— |
2025-12-05 |
| PR |
0.00 |
First attempt |
|
— |
2025-11-21 |
| PR |
0.00 |
[Security] Fix HIGH vulnerability: trailofbits.python.pickle |
|
— |
2025-11-19 |
| PR |
0.00 |
fix: attn_forwad when is_causal=True assert attn_mask is Non |
|
— |
2025-11-18 |
| PR |
0.00 |
Train_Grpo: add comments |
|
— |
2025-10-26 |
| PR |
0.00 |
[feat] update model install method |
|
— |
2025-11-05 |
| PR |
0.00 |
[feat] add MNN support to README. |
|
— |
2025-11-10 |
| PR |
0.00 |
fix: Loading LoRA parameters which saved from multi-card tra |
|
— |
2025-11-06 |
| PR |
0.00 |
Update eval_llm.py |
|
— |
2025-11-03 |
| PR |
0.00 |
Merge pull request #1 from jingyaogong/master |
|
— |
2025-10-29 |
| PR |
0.00 |
Hope to integrate the SwanLab experiment tracking tool |
|
— |
2025-02-28 |
| PR |
0.00 |
Add comments explaining Attention Trainer details |
|
— |
2025-08-15 |
| PR |
0.00 |
Remove the model context limit, add a dynamic length extension mechanism, and maintain forward compatibility |
|
— |
2025-07-09 |
| PR |
0.00 |
Add optional MLA support, fix model internal precision consistency, optimize code; add mla, fix model dtype, improve |
|
— |
2025-02-28 |
| PR |
0.00 |
Improve training performance with torch.compile and torch.am |
|
— |
2025-05-21 |
| PR |
0.00 |
Ascend NPU adaptation |
|
— |
2025-05-16 |
| PR |
0.00 |
Improve comments and training scripts |
|
— |
2025-05-08 |
| PR |
0.00 |
Change default parameters of serve_openai_api.py |
|
— |
2025-04-30 |
| PR |
0.00 |
Hotfix/issues 382 |
|
— |
2025-04-29 |
| PR |
0.00 |
Update eval_model.py |
|
— |
2025-04-26 |
| PR |
0.00 |
sft should use pretrain model |
|
— |
2025-04-26 |
| PR |
0.00 |
Improve README instructions on loading existing models |
|
— |
2025-04-20 |
| PR |
0.00 |
chore: auto detect mps for pre train |
|
— |
2025-04-05 |
| PR |
0.00 |
Add Load ckpt |
|
— |
2025-04-03 |
| PR |
0.00 |
Little typo of readme |
|
— |
2025-03-09 |
| PR |
0.00 |
Add the interface testing interface for model API deployment |
|
— |
2025-02-21 |
| PR |
0.00 |
add smart gradient accumulation |
|
— |
2025-02-21 |
| PR |
0.00 |
Add ckp_dir and tokenizer path |
|
— |
2025-02-18 |
| PR |
0.00 |
Fixed chat_template logic in tokenizer training, and fixed tokenizer_config.json |
|
— |
2024-11-07 |
| PR |
0.00 |
Stabilize full SFT |
|
— |
2025-10-23 |
| PR |
0.00 |
Resume training |
|
— |
2025-05-28 |
| PR |
0.00 |
Fix Flash Attention attn_mask and is_causal conflict in Atte |
|
— |
2025-10-18 |
| PR |
0.00 |
Minimind |
|
— |
2025-10-10 |
| PR |
0.00 |
Mainly adds support for using huggingface models directly |
|
— |
2025-05-25 |
| PR |
0.00 |
update |
|
— |
2025-06-28 |
| PR |
0.00 |
Fix bug #329: change top_p parameter type from int to float |
|
— |
2025-04-09 |
| PR |
0.00 |
123 |
|
— |
2025-04-17 |
| PR |
0.00 |
Remove the duplicate … and … added at the beginning and end when building the input text |
|
— |
2025-04-02 |
| PR |
0.00 |
feat: optimize imports and code style |
|
— |
2025-02-11 |
| PR |
0.00 |
Update README.md |
|
— |
2025-02-07 |
| PR |
0.00 |
fix weight initialization for residual block |
|
— |
2025-02-03 |
| PR |
0.00 |
fix 5-dpo train |
|
— |
2025-01-31 |
| PR |
0.00 |
Remove unnecessary code. |
|
— |
2024-12-03 |
| PR |
0.00 |
Update 5-dpo_train.py |
|
— |
2024-11-14 |
| PR |
0.00 |
fix 5-dpo_train.py bugs |
|
— |
2024-10-11 |
| PR |
0.00 |
Fix wandb bug & add argparse |
|
— |
2024-09-24 |
| PR |
0.00 |
Add wandb |
|
— |
2024-09-23 |
| PR |
0.00 |
Fix bug in data_process.py |
|
— |
2024-09-23 |
| PR |
0.00 |
Update requirements.txt |
|
— |
2024-09-05 |