🙏
Everything is OK!

Peter Chan cgpeter96
  • China
# train_grpo.py
from typing import *
import re
import torch
from datasets import load_dataset, Dataset, load_from_disk
from transformers import AutoTokenizer, AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig
from trl import GRPOConfig, GRPOTrainer, TrlParser
from dataclasses import dataclass, field
@cgpeter96
cgpeter96 / tokenization.cpp
Created February 3, 2023 09:15 — forked from luistung/tokenization.cpp
C++ version of the BERT tokenizer
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <unordered_map>
#include <boost/algorithm/string.hpp>
#include <utf8proc.h>
//https://unicode.org/reports/tr15/#Norm_Forms
//https://ssl.icu-project.org/apiref/icu4c/uchar_8h.html
@cgpeter96
cgpeter96 / road_extraction.py
Created February 24, 2021 15:10
Street name extraction script (simple version)
"""
@desc:
en: a simple script for road name extraction
cn: a simple street extraction script
@author:peter
@mail:[email protected]
@date:2021/2/24
@note:
This script may have problems, but since this is all the data available for now, it is left as is; for reference only.
"""
@cgpeter96
cgpeter96 / distutils.cfg
Created February 25, 2019 15:47
Helps install pycocoapi
[build]
compiler=msvc
@cgpeter96
cgpeter96 / test.py
Created May 22, 2018 07:40
Hello world
print('Hello world!')