I'm attempting to quantize a YOLOv8n model from the Ultralytics package using MCT GPTQ. However, I encounter this error during the calibration process:
-> 1106 gptq_quant_model, _ = mct.gptq.pytorch_gradient_post_training_quantization(
1107 model=self.model,
1108 representative_data_gen=representative_dataset_gen,
1109 target_resource_utilization=resource_utilization,
1110 gptq_config=gptq_config,
1111 core_config=config,
1112 target_platform_capabilities=tpc)
1114 print('Quantized-GPTQ model is ready')
1116 return f, None
File ~/repos/model_optimization/model_compression_toolkit/gptq/pytorch/quantization_facade.py:196, in pytorch_gradient_post_training_quantization(model, representative_data_gen, target_resource_utilization, core_config, gptq_config, gptq_representative_data_gen, target_platform_capabilities)
191 float_graph = copy.deepcopy(graph)
193 # ---------------------- #
194 # GPTQ Runner
195 # ---------------------- #
--> 196 graph_gptq = gptq_runner(graph,
197 core_config,
198 gptq_config,
199 representative_data_gen,
200 gptq_representative_data_gen if gptq_representative_data_gen else representative_data_gen,
201 DEFAULT_PYTORCH_INFO,
202 fw_impl,
203 tb_w,
204 hessian_info_service=hessian_info_service)
206 if core_config.debug_config.analyze_similarity:
207 analyzer_model_quantization(representative_data_gen,
208 tb_w,
209 float_graph,
210 graph_gptq,
211 fw_impl,
212 DEFAULT_PYTORCH_INFO)
File ~/repos/model_optimization/model_compression_toolkit/gptq/runner.py:115, in gptq_runner(tg, core_config, gptq_config, representative_data_gen, gptq_representative_data_gen, fw_info, fw_impl, tb_w, hessian_info_service)
111 #############################################
112 # Gradient Based Post Training Quantization
113 #############################################
114 Logger.info("Running GPTQ optimization.")
--> 115 tg_gptq = _apply_gptq(gptq_config,
116 gptq_representative_data_gen,
117 tb_w,
118 tg,
119 tg_bias,
120 fw_info,
121 fw_impl,
122 hessian_info_service=hessian_info_service)
124 return tg_gptq
File ~/repos/model_optimization/model_compression_toolkit/gptq/runner.py:62, in _apply_gptq(gptq_config, representative_data_gen, tb_w, tg, tg_bias, fw_info, fw_impl, hessian_info_service)
43 """
44 Apply GPTQ to improve accuracy of quantized model.
45 Build two models from a graph: A teacher network (float model) and a student network (quantized model).
(...)
59
60 """
61 if gptq_config is not None and gptq_config.n_epochs > 0:
---> 62 tg_bias = gptq_training(tg,
63 tg_bias,
64 gptq_config,
65 representative_data_gen,
66 fw_impl,
67 fw_info,
68 hessian_info_service=hessian_info_service)
70 if tb_w is not None:
71 tb_w.add_graph(tg_bias, 'after_gptq')
File ~/repos/model_optimization/model_compression_toolkit/gptq/common/gptq_training.py:287, in gptq_training(graph_float, graph_quant, gptq_config, representative_data_gen, fw_impl, fw_info, hessian_info_service)
278 gptq_trainer = gptq_trainer_obj(graph_float,
279 graph_quant,
280 gptq_config,
(...)
283 representative_data_gen,
284 hessian_info_service=hessian_info_service)
286 # Training process
--> 287 gptq_trainer.train(representative_data_gen)
289 # Update graph
290 graph_quant = gptq_trainer.update_graph()
File ~/repos/model_optimization/model_compression_toolkit/gptq/pytorch/gptq_training.py:193, in PytorchGPTQTrainer.train(self, representative_data_gen)
190 optimizer.add_param_group({'params': params})
192 # Set models mode
--> 193 set_model(self.float_model, False)
194 set_model(self.fxp_model, True)
195 self._set_requires_grad()
File ~/repos/model_optimization/model_compression_toolkit/core/pytorch/utils.py:41, in set_model(model, train_mode)
38 model.eval()
40 device = get_working_device()
---> 41 model.to(device)
TypeError: descriptor 'to' for 'torch._C.TensorBase' objects doesn't apply to a 'torch.device' object
import os
import model_compression_toolkit as mct
from tutorials.mct_model_garden.evaluation_metrics.coco_evaluation import coco_dataset_generator
from tutorials.mct_model_garden.models_pytorch.yolov8.yolov8_preprocess import yolov8_preprocess_chw_transpose
from typing import Iterator, Tuple, List
import wget
import zipfile
import logging

DATASET_ROOT = "./coco"
if not os.path.isdir(DATASET_ROOT):
    logging.info('Downloading COCO dataset')
    os.mkdir(DATASET_ROOT)
    wget.download('http://images.cocodataset.org/annotations/annotations_trainval2017.zip')
    with zipfile.ZipFile("annotations_trainval2017.zip", 'r') as zip_ref:
        zip_ref.extractall(DATASET_ROOT)
    os.remove('annotations_trainval2017.zip')
    wget.download('http://images.cocodataset.org/zips/val2017.zip')
    with zipfile.ZipFile("val2017.zip", 'r') as zip_ref:
        zip_ref.extractall(DATASET_ROOT)
    os.remove('val2017.zip')

from ultralytics import YOLO
from ultralytics.nn.modules import C2f, Detect

model = YOLO("yolov8n.pt").model
for m in model.modules():
    if isinstance(m, C2f):
        m.forward = m.forward_fx
    if isinstance(m, Detect):
        m.export = True
        m.format = "mct"

REPRESENTATIVE_DATASET_FOLDER = f'{DATASET_ROOT}/val2017/'
REPRESENTATIVE_DATASET_ANNOTATION_FILE = f'{DATASET_ROOT}/annotations/instances_val2017.json'
BATCH_SIZE = 4
n_iters = 20

# Load representative dataset
logging.info('Loading representative dataset')
representative_dataset = coco_dataset_generator(dataset_folder=REPRESENTATIVE_DATASET_FOLDER,
                                                annotation_file=REPRESENTATIVE_DATASET_ANNOTATION_FILE,
                                                preprocess=yolov8_preprocess_chw_transpose,
                                                batch_size=BATCH_SIZE)

# Define representative dataset generator
def get_representative_dataset(n_iter: int, dataset_loader: Iterator[Tuple]):
    """
    This function creates a representative dataset generator. The generator yields numpy
    arrays of batches of shape: [Batch, H, W ,C].

    Args:
        n_iter: number of iterations for MCT to calibrate on
    Returns:
        A representative dataset generator
    """
    def representative_dataset() -> Iterator[List]:
        ds_iter = iter(dataset_loader)
        for _ in range(n_iter):
            yield [next(ds_iter)[0]]

    return representative_dataset

logging.info('Creating representative dataset generator')
# Get representative dataset generator
representative_dataset_gen = get_representative_dataset(n_iter=n_iters,
                                                        dataset_loader=representative_dataset)

# Set IMX500-v1 TPC
logging.info('Setting target platform capabilities')
tpc = mct.get_target_platform_capabilities(fw_name="pytorch",
                                           target_platform_name='imx500',
                                           target_platform_version='v1')

# Specify the necessary configuration for mixed precision quantization. To keep the tutorial
# brief, we'll use a small set of images and omit the hessian metric for mixed precision
# calculations. It's important to be aware that this choice may impact the resulting accuracy.
mp_config = mct.core.MixedPrecisionQuantizationConfig(num_of_images=5,
                                                      use_hessian_based_scores=False)
config = mct.core.CoreConfig(mixed_precision_config=mp_config,
                             quantization_config=mct.core.QuantizationConfig(shift_negative_activation_correction=True))

# Define target Resource Utilization for mixed precision weights quantization
# (75% of 'standard' 8bits quantization)
resource_utilization_data = mct.core.pytorch_resource_utilization_data(in_model=model,
                                                                       representative_data_gen=representative_dataset_gen,
                                                                       core_config=config,
                                                                       target_platform_capabilities=tpc)
resource_utilization = mct.core.ResourceUtilization(weights_memory=resource_utilization_data.weights_memory * 0.75)

# Specify the necessary configuration for Gradient-Based PTQ.
n_gptq_epochs = 1000
gptq_config = mct.gptq.get_pytorch_gptq_config(n_epochs=n_gptq_epochs, use_hessian_based_weights=False)

# Perform Gradient-Based Post Training Quantization
gptq_quant_model, _ = mct.gptq.pytorch_gradient_post_training_quantization(
    model=model,
    representative_data_gen=representative_dataset_gen,
    target_resource_utilization=resource_utilization,
    gptq_config=gptq_config,
    core_config=config,
    target_platform_capabilities=tpc)
print('Quantized-GPTQ model is ready')
Log output
No response
Hi @ambitious-octopus ,
We have found the root cause of this error. Your model performs operations on constants, such as "to" and "mul", which cause failures in MCT (specifically, the model.to(device) error).
More specifically, I think these operations occur in the anchor preparation step of your model.
This issue runs deeper, as manipulating constants during model inference can lead to accuracy degradation. Performing these manipulations in advance and using final constant values instead would enhance accuracy and reduce unnecessary calculations. Therefore, we recommend removing constant manipulations from the model and using the finalized constant values instead. This approach should also resolve issue 1189.
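As a rough illustration of the recommendation above, here is a minimal PyTorch sketch. Both classes, and the `stride` and `size` parameters, are hypothetical and are not taken from Ultralytics or MCT. The first head rebuilds a constant grid with "to" and "mul" on every forward pass, and the second computes the final values once in the constructor and stores them as a buffer. Whether a registered buffer is the form MCT handles best is an assumption to verify against your MCT version.

```python
import torch
import torch.nn as nn


class HeadWithRuntimeConstants(nn.Module):
    """Hypothetical head that rebuilds its constant grid on every forward pass."""

    def __init__(self, stride: float = 8.0, size: int = 4):
        super().__init__()
        self.stride = stride
        self.size = size

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Constant-only operations ("to", "mul") executed at inference time.
        anchors = torch.arange(self.size).to(x.dtype).mul(self.stride)
        return x + anchors


class HeadWithBakedConstants(nn.Module):
    """Same computation, with the constant finalized once at construction."""

    def __init__(self, stride: float = 8.0, size: int = 4):
        super().__init__()
        # Compute the final constant values ahead of time.
        anchors = torch.arange(size, dtype=torch.float32) * stride
        # A buffer follows the module across devices but is not a trainable parameter.
        self.register_buffer("anchors", anchors, persistent=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only the input-dependent operation remains in the forward pass.
        return x + self.anchors


x = torch.zeros(2, 4)
runtime_out = HeadWithRuntimeConstants()(x)
baked_out = HeadWithBakedConstants()(x)
```

Both heads should produce identical outputs for float32 inputs; the difference is only in where the constant arithmetic happens.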
Issue Type
Bug
Source
pip (mct-nightly)
MCT Version
PR #1186
OS Platform and Distribution
Linux Ubuntu 22.04
Python version
3.10
cc: @Idan-BenAmi
Expected behaviour
No response