Created August 18, 2024 22:02 (gist relyt0925/dca01c04f85ac86e07aaa2500d418ddf)
mmlu_branch_eval
[root@tyler-a100-newimage-val root]# /root/bin/ilab.sh --config /var/mnt/inststg1/instructlab/config.yaml model evaluate --model /var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/ --base-model /var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/ --benchmark mmlu_branch --tasks-dir /var/mnt/inststg1/instructlab/generated/node_datasets_2024-08-18T15_57_14/
Using local safetensors found at '/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/' for '--model'
INFO 2024-08-18 22:00:17,135 numexpr.utils:145: Note: detected 80 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
INFO 2024-08-18 22:00:17,135 numexpr.utils:148: Note: NumExpr detected 80 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
INFO 2024-08-18 22:00:17,135 numexpr.utils:161: NumExpr defaulting to 16 threads.
INFO 2024-08-18 22:00:17,797 datasets:58: PyTorch version 2.3.1 available.
INFO 2024-08-18 22:00:29,170 lm-eval:152: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
INFO 2024-08-18 22:00:29,171 lm-eval:189: Initializing hf model, with arguments: {'pretrained': '/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/', 'dtype': 'bfloat16'}
INFO 2024-08-18 22:00:29,356 lm-eval:170: Using device 'cuda'
Generating test split: 108 examples [00:00, 4558.66 examples/s]
WARNING 2024-08-18 22:00:34,788 lm-eval:325: [Task: knowledge_compliance_personally-identifiable-information] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended.
WARNING 2024-08-18 22:00:34,788 lm-eval:325: [Task: knowledge_compliance_personally-identifiable-information] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended.
WARNING 2024-08-18 22:00:34,808 lm-eval:251: Overwriting default num_fewshot of knowledge_compliance_personally-identifiable-information from None to 5
INFO 2024-08-18 22:00:34,808 lm-eval:261: Setting fewshot random generator seed to 1234
INFO 2024-08-18 22:00:34,809 lm-eval:411: Building contexts for knowledge_compliance_personally-identifiable-information on rank 0...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 108/108 [00:01<00:00, 71.45it/s]
INFO 2024-08-18 22:00:36,335 lm-eval:438: Running loglikelihood requests
Running loglikelihood requests: 0%| | 0/432 [00:00<?, ?it/s]
Passed argument batch_size = auto:1. Detecting largest batch size
We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)
Determined largest batch size: 64
Running loglikelihood requests: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 432/432 [00:32<00:00, 13.16it/s]
WARNING 2024-08-18 22:01:10,217 lm-eval:1315: Failed to get model SHA for /var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/ at revision main. Error: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/'. Use `repo_type` argument if needed.
fatal: not a git repository (or any of the parent directories): .git
INFO 2024-08-18 22:01:19,669 lm-eval:152: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
INFO 2024-08-18 22:01:19,669 lm-eval:189: Initializing hf model, with arguments: {'pretrained': '/var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/', 'dtype': 'bfloat16'}
INFO 2024-08-18 22:01:19,671 lm-eval:170: Using device 'cuda'
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:06<00:00, 2.04s/it]
WARNING 2024-08-18 22:01:26,005 lm-eval:325: [Task: knowledge_compliance_personally-identifiable-information] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended.
WARNING 2024-08-18 22:01:26,006 lm-eval:325: [Task: knowledge_compliance_personally-identifiable-information] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended.
WARNING 2024-08-18 22:01:26,024 lm-eval:251: Overwriting default num_fewshot of knowledge_compliance_personally-identifiable-information from None to 5
INFO 2024-08-18 22:01:26,024 lm-eval:261: Setting fewshot random generator seed to 1234
INFO 2024-08-18 22:01:26,024 lm-eval:411: Building contexts for knowledge_compliance_personally-identifiable-information on rank 0...
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 108/108 [00:01<00:00, 70.26it/s]
INFO 2024-08-18 22:01:27,577 lm-eval:438: Running loglikelihood requests
Running loglikelihood requests: 0%| | 0/432 [00:00<?, ?it/s]
Passed argument batch_size = auto:1. Detecting largest batch size
Determined largest batch size: 64
Running loglikelihood requests: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 432/432 [00:31<00:00, 13.56it/s]
WARNING 2024-08-18 22:02:00,506 lm-eval:1315: Failed to get model SHA for /var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/ at revision main. Error: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/'. Use `repo_type` argument if needed.
fatal: not a git repository (or any of the parent directories): .git
# KNOWLEDGE EVALUATION REPORT
## BASE MODEL
/var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/
## MODEL
/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/
### AVERAGE:
-0.02 (across 1)
### REGRESSIONS:
1. knowledge_compliance_personally-identifiable-information (-0.02)
[root@tyler-a100-newimage-val root]#
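
The report above gives only the per-task delta between the checkpoint and the base model, not the absolute scores. The sketch below shows one way such a summary could be derived from two sets of per-task accuracies. It is not taken from InstructLab's source: the function name `summarize_branch_eval`, the result keys, and the scores 0.50 and 0.48 are hypothetical, chosen only so that the delta matches the -0.02 shown in the log.

```python
from statistics import mean


def summarize_branch_eval(base_scores, model_scores):
    """Compare per-task accuracy of a checkpoint against a base model.

    Hypothetical helper: both arguments map task name -> accuracy in [0, 1].
    Only tasks present in both mappings are compared.
    """
    deltas = {
        task: round(model_scores[task] - base_scores[task], 2)
        for task in base_scores
        if task in model_scores
    }
    return {
        # Mean delta over all compared tasks, as in "AVERAGE: ... (across N)".
        "average": round(mean(deltas.values()), 2) if deltas else 0.0,
        "task_count": len(deltas),
        "improvements": {t: d for t, d in deltas.items() if d > 0},
        "regressions": {t: d for t, d in deltas.items() if d < 0},
        "no_change": [t for t, d in deltas.items() if d == 0],
    }


# Hypothetical absolute scores; the log only reports the difference.
task = "knowledge_compliance_personally-identifiable-information"
report = summarize_branch_eval({task: 0.50}, {task: 0.48})
print(f"AVERAGE: {report['average']:+.2f} (across {report['task_count']})")
for name, delta in report["regressions"].items():
    print(f"REGRESSION: {name} ({delta:+.2f})")
```

With a single task of 108 test examples, a delta of -0.02 corresponds to roughly two questions, so a result like this is worth re-checking on a larger task set before treating it as a real regression.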