relyt0925 · August 18, 2024 22:02
diff --git a/gistfile1.txt b/gistfile1.txt
 [root@tyler-a100-newimage-val root]# /root/bin/ilab.sh --config /var/mnt/inststg1/instructlab/config.yaml model evaluate --model /var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/ --base-model /var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/ --benchmark mmlu_branch  --tasks-dir /var/mnt/inststg1/instructlab/generated/node_datasets_2024-08-18T15_57_14/
 Using local safetensors found at '/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/' for '--model'
 INFO 2024-08-18 22:00:17,135 numexpr.utils:145: Note: detected 80 virtual cores but NumExpr set to maximum of 64, check "NUMEXPR_MAX_THREADS" environment variable.
 INFO 2024-08-18 22:00:17,135 numexpr.utils:148: Note: NumExpr detected 80 cores but "NUMEXPR_MAX_THREADS" not set, so enforcing safe limit of 16.
 INFO 2024-08-18 22:00:17,135 numexpr.utils:161: NumExpr defaulting to 16 threads.
 INFO 2024-08-18 22:00:17,797 datasets:58: PyTorch version 2.3.1 available.
 INFO 2024-08-18 22:00:29,170 lm-eval:152: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
 INFO 2024-08-18 22:00:29,171 lm-eval:189: Initializing hf model, with arguments: {'pretrained': '/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/', 'dtype': 'bfloat16'}
 INFO 2024-08-18 22:00:29,356 lm-eval:170: Using device 'cuda'
 Generating test split: 108 examples [00:00, 4558.66 examples/s]
 WARNING 2024-08-18 22:00:34,788 lm-eval:325: [Task: knowledge_compliance_personally-identifiable-information] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended.
 WARNING 2024-08-18 22:00:34,788 lm-eval:325: [Task: knowledge_compliance_personally-identifiable-information] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended.
 WARNING 2024-08-18 22:00:34,808 lm-eval:251: Overwriting default num_fewshot of knowledge_compliance_personally-identifiable-information from None to 5
 INFO 2024-08-18 22:00:34,808 lm-eval:261: Setting fewshot random generator seed to 1234
 INFO 2024-08-18 22:00:34,809 lm-eval:411: Building contexts for knowledge_compliance_personally-identifiable-information on rank 0...
 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 108/108 [00:01<00:00, 71.45it/s]
 INFO 2024-08-18 22:00:36,335 lm-eval:438: Running loglikelihood requests
 Running loglikelihood requests:   0%|                                                                                                                                                                                      | 0/432 [00:00<?, ?it/s]Passed argument batch_size = auto:1. Detecting largest batch size
 We detected that you are passing `past_key_values` as a tuple and this is deprecated and will be removed in v4.43. Please use an appropriate `Cache` class (https://huggingface.co/docs/transformers/v4.41.3/en/internal/generation_utils#transformers.Cache)
 Determined largest batch size: 64
 Running loglikelihood requests: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 432/432 [00:32<00:00, 13.16it/s]
 WARNING 2024-08-18 22:01:10,217 lm-eval:1315: Failed to get model SHA for /var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/ at revision main. Error: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/'. Use `repo_type` argument if needed.
 fatal: not a git repository (or any of the parent directories): .git
 INFO 2024-08-18 22:01:19,669 lm-eval:152: Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234
 INFO 2024-08-18 22:01:19,669 lm-eval:189: Initializing hf model, with arguments: {'pretrained': '/var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/', 'dtype': 'bfloat16'}
 INFO 2024-08-18 22:01:19,671 lm-eval:170: Using device 'cuda'
 Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:06<00:00,  2.04s/it]
 WARNING 2024-08-18 22:01:26,005 lm-eval:325: [Task: knowledge_compliance_personally-identifiable-information] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended.
 WARNING 2024-08-18 22:01:26,006 lm-eval:325: [Task: knowledge_compliance_personally-identifiable-information] has_training_docs and has_validation_docs are False, using test_docs as fewshot_docs but this is not recommended.
 WARNING 2024-08-18 22:01:26,024 lm-eval:251: Overwriting default num_fewshot of knowledge_compliance_personally-identifiable-information from None to 5
 INFO 2024-08-18 22:01:26,024 lm-eval:261: Setting fewshot random generator seed to 1234
 INFO 2024-08-18 22:01:26,024 lm-eval:411: Building contexts for knowledge_compliance_personally-identifiable-information on rank 0...
 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 108/108 [00:01<00:00, 70.26it/s]
 INFO 2024-08-18 22:01:27,577 lm-eval:438: Running loglikelihood requests
 Running loglikelihood requests:   0%|                                                                                                                                                                                      | 0/432 [00:00<?, ?it/s]Passed argument batch_size = auto:1. Detecting largest batch size
 Determined largest batch size: 64
 Running loglikelihood requests: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 432/432 [00:31<00:00, 13.56it/s]
 WARNING 2024-08-18 22:02:00,506 lm-eval:1315: Failed to get model SHA for /var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/ at revision main. Error: Repo id must be in the form 'repo_name' or 'namespace/repo_name': '/var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/'. Use `repo_type` argument if needed.
 fatal: not a git repository (or any of the parent directories): .git
 # KNOWLEDGE EVALUATION REPORT

 ## BASE MODEL
 /var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/

 ## MODEL
 /var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/

 ### AVERAGE:
 -0.02 (across 1)

 ### REGRESSIONS:
 1. knowledge_compliance_personally-identifiable-information (-0.02)
 [root@tyler-a100-newimage-val root]#
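
For anyone scripting against this output, here is a minimal sketch of pulling the per-task deltas out of the report text. The function name `parse_report` and both regular expressions are hypothetical and are not part of `ilab` or `lm-eval`. The layout is assumed from the single run above: `#`-style headings followed by numbered `task (delta)` entries. Other sections the tool may print are not shown in this log, so the sketch treats every heading generically.

```python
import re

# Numbered entries such as "1. some_task (-0.02)".
ENTRY = re.compile(r"^\s*\d+\.\s+(\S+)\s+\(([+-]?\d+(?:\.\d+)?)\)\s*$")
# Headings such as "### REGRESSIONS:" (layout assumed from the log above).
HEADING = re.compile(r"^\s*#{1,3}\s+(.+?):?\s*$")


def parse_report(text):
    """Return {section: {task: delta}} for every numbered entry in the report."""
    sections = {}
    current = None
    for line in text.splitlines():
        heading = HEADING.match(line)
        if heading:
            current = heading.group(1)
            continue
        entry = ENTRY.match(line)
        if entry and current is not None:
            task, delta = entry.group(1), float(entry.group(2))
            sections.setdefault(current, {})[task] = delta
    return sections


# Report text copied from the run above.
report = """\
# KNOWLEDGE EVALUATION REPORT

## BASE MODEL
/var/mnt/inststg1/instructlab/models/granite-7b-starter1.1/

## MODEL
/var/mnt/inststg1/instructlab/phasedbasedir/phase2/checkpoints/hf_format/samples_25376/

### AVERAGE:
-0.02 (across 1)

### REGRESSIONS:
1. knowledge_compliance_personally-identifiable-information (-0.02)
"""

if __name__ == "__main__":
    for section, tasks in parse_report(report).items():
        for task, delta in tasks.items():
            print(f"{section}: {task} {delta:+.2f}")
```

Sections without numbered entries (the model paths and the average line) are skipped, so the result contains only the regression entry for this run.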