Fixed int8 quantization and added experimental mixed int8/int16 quantization #228

Open

mikljohansson wants to merge 5 commits into master
Conversation

mikljohansson

@mikljohansson mikljohansson commented Sep 12, 2020

Thanks for providing these examples of working with Yolo v4/v3!

I managed to fix the int8 quantization by adding a model.compile() call, which resolves the "optimize global tensors" exception. I was also able to remove the supported_ops override by following the quantization examples that TensorFlow Lite provides.
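
Roughly, the conversion now follows the standard TensorFlow Lite post-training quantization flow. A simplified sketch (the checkpoint path and the dummy calibration data are placeholders; the real convert_tflite.py loads actual calibration images):

    import numpy as np
    import tensorflow as tf

    def representative_data_gen():
        # Placeholder calibration data; convert_tflite.py reads real images instead
        for _ in range(10):
            yield [np.random.uniform(0.0, 1.0, size=(1, 416, 416, 3)).astype(np.float32)]

    model = tf.keras.models.load_model('./checkpoints/yolov4-416')
    model.compile()  # works around the "optimize global tensors" exception

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_data_gen

    tflite_model = converter.convert()
    with open('./checkpoints/yolov4-416-int8.tflite', 'wb') as f:
        f.write(tflite_model)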

FYI: I'm currently trying to port these models to the K210 / MaixPy MCU, but so far I haven't managed to get nncase to fully consume the tflite files (it doesn't support the SPLIT and DEQUANTIZE tflite op codes).

Note: This only works on the latest tf-nightly (2.4.0+). It doesn't work on tensorflow-2.3.0

It doesn't fully quantize currently, since the network uses some non-quantizable ops (EXP). I've not looked further into that yet.

Best regards,
Mikael

@JimBratsos

Hello, thanks for the great work. Will the above fixes fully quantize the yolov4/v3 model so that it can run on a TPU?

@mikljohansson
Author

Hello, thanks for the great work. Will the above fixes fully quantize the yolov4/v3 model so that it can run on a TPU?

No, unfortunately it doesn't fully quantize currently, since the network uses some non-quantizable ops (EXP). I've not looked further into that yet.

When I try with

    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.target_spec.supported_types = [tf.int8, tf.uint8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.int8
    converter.representative_dataset = representative_data_gen

I get

RuntimeError: Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
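
For reference, the error only appears when forcing integer-only ops like above. With the default op set (float fallback) the conversion itself goes through, but ops like EXP stay in float, which is why the model doesn't fully quantize. A sketch of that fallback configuration, reusing the converter and representative_data_gen from above:

    # Integer quantization with float fallback: don't force TFLITE_BUILTINS_INT8
    # or the int8/uint8 inference types, so non-quantizable ops (EXP) remain float
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_data_gen
    tflite_model = converter.convert()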

@raryanpur

raryanpur commented Sep 23, 2020

@mikljohansson, running your patch set (pulling your forked repo) gives the following error in my environment

  File "convert_tflite.py", line 87, in <module>                                          
    app.run(main)                                                                                                                                                                                                                                                                                                           
  File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/absl/app.py", line 300, in run                         
    _run_main(main, args)                                                                                                                                                                                                                                
  File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/absl/app.py", line 251, in _run_main
    sys.exit(main(argv))                                                                                                                                                             
  File "convert_tflite.py", line 82, in main                                                                                                     
    save_tflite()                                                                                                                                                                                                                                                                           
  File "convert_tflite.py", line 56, in save_tflite                          
    tflite_model = converter.convert()                                                                                                                                            
  File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 892, in convert                                                                                                                                                   
    self).convert(graph_def, input_tensors, output_tensors)            
  File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 650, in convert
    result = self._calibrate_quantize_model(result, **flags)                                                                                
  File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/tensorflow/lite/python/lite.py", line 478, in _calibrate_quantize_model                                                    
    inference_output_type, allow_float, activations_type)                                                  
  File "/Applications/anaconda3/envs/yolov3-tf2-cpu/lib/python3.7/site-packages/tensorflow/lite/python/optimize/calibrator.py", line 98, in calibrate_and_quantize
    np.dtype(activations_type.as_numpy_dtype()).num)                 
RuntimeError: Max and min for dynamic tensors should be recorded during calibration: Failed for tensor input_1
Empty min/max for tensor input_1

Since this depends on tf-nightly, perhaps something has changed in the 10 days since you made this PR? I'm using tf-nightly 2.4.0-dev20200918 and Python 3.7.0.

Note that the following also appeared in the debug log, well before the backtrace:

WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.                                                               
W0922 23:03:29.533921 4621229504 load.py:133] No training configuration found in save file, so the model was *not* compiled. Compile it manually.

@mikljohansson
Author

mikljohansson commented Sep 23, 2020

@mikljohansson, running your patch set (pulling your forked repo) gives the following error in my environment

RuntimeError: Max and min for dynamic tensors should be recorded during calibration: Failed for tensor input_1
Empty min/max for tensor input_1

@raryanpur I think the problem might be that some file paths in the calibration dataset are incorrect (e.g. ./data/dataset/val2017.txt). I got this error myself, and it took me a while to figure out that I had the sample image paths wrong 😅

I've improved the error reporting for this now; if you pull and try again it should give you a better message about what's wrong. If it turns out the dataset is missing, there are instructions in the README.md on how to download it.
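
Roughly, the calibration generator just reads real images from that list, so a bad path now fails loudly. A simplified sketch (not the exact convert_tflite.py code, and it assumes each line of val2017.txt starts with an image path):

    import cv2
    import numpy as np

    def representative_data_gen():
        with open('./data/dataset/val2017.txt') as f:
            paths = [line.split()[0] for line in f if line.strip()]
        for path in paths[:100]:
            img = cv2.imread(path)
            if img is None:
                # Fail with a clear message instead of the opaque min/max error
                raise FileNotFoundError('Calibration image not found: ' + path)
            img = cv2.resize(img, (416, 416)).astype(np.float32) / 255.0
            yield [img[np.newaxis, ...]]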

Best,
Mikael

@raryanpur

Ah that did the trick - thanks @mikljohansson, works now!

@raryanpur

raryanpur commented Sep 25, 2020

@mikljohansson when using this quantized model, how are the inputs and outputs scaled? My understanding is that the inputs are still floats, but the values must be scaled from [0.0, 255.0] to [-128.0, 127.0]. Do the outputs (score and box tensor values) need to be scaled as well?

@YLTsai0609

Hi @mikljohansson, thanks for your great work. After running your modification, I got my yolov3_int_8.tflite model to work.

The message shown was:

[{'name': 'input_1', 'index': 549, 'shape': array([  1, 416, 416,   3], dtype=int32), 'shape_signature': array([ -1, 416, 416,   3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
[{'name': 'Identity', 'index': 550, 'shape': array([    1, 10647,     4], dtype=int32), 'shape_signature': array([ 1, -1,  4], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'Identity_1', 'index': 551, 'shape': array([    1, 10647,     3], dtype=int32), 'shape_signature': array([ 1, -1,  3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

My question is: why does it say int32? We actually did int8 quantization, right?
Is there any resource about this?
Much appreciated, and thanks again for your nice work.

@mikljohansson
Author

@mikljohansson when using this quantized model, how are the inputs and outputs scaled? My understanding is that the inputs are still floats, but the values must be scaled from [0.0, 255.0] to [-128.0, 127.0]. Do the outputs (score and box tensor values) need to be scaled as well?

@raryanpur sorry for not getting back sooner, the e-mail got lost in my inbox :(

I honestly don't know, sorry. I haven't dug into the input/output scaling and haven't worked on this model for a while (focusing on other things right now). Hopefully you've been able to work it out already :)
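
In general though, TF Lite uses real_value = scale * (quantized_value - zero_point), and you can read the scale and zero point for the input and output tensors from the interpreter if they end up quantized. A rough sketch (the model filename is a placeholder):

    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path='./checkpoints/yolov4-416-int8.tflite')
    interpreter.allocate_tensors()

    for detail in interpreter.get_input_details() + interpreter.get_output_details():
        scale, zero_point = detail['quantization']
        # For a quantized tensor: real_value = scale * (quantized_value - zero_point)
        # A scale of 0.0 means the tensor is not quantized and takes plain float values
        print(detail['name'], detail['dtype'], scale, zero_point)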

@mikljohansson
Author

Hi @mikljohansson, thanks for your great work. After running your modification, I got my yolov3_int_8.tflite model to work.

The message shown was:

[{'name': 'input_1', 'index': 549, 'shape': array([  1, 416, 416,   3], dtype=int32), 'shape_signature': array([ -1, 416, 416,   3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]
[{'name': 'Identity', 'index': 550, 'shape': array([    1, 10647,     4], dtype=int32), 'shape_signature': array([ 1, -1,  4], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}, {'name': 'Identity_1', 'index': 551, 'shape': array([    1, 10647,     3], dtype=int32), 'shape_signature': array([ 1, -1,  3], dtype=int32), 'dtype': <class 'numpy.float32'>, 'quantization': (0.0, 0), 'quantization_parameters': {'scales': array([], dtype=float32), 'zero_points': array([], dtype=int32), 'quantized_dimension': 0}, 'sparsity_parameters': {}}]

My question is: why does it say int32? We actually did int8 quantization, right?
Is there any resource about this?
Much appreciated, and thanks again for your nice work.

Honestly I'm not sure why that is. It could be because the network doesn't quantize fully (due to the EXP operator mentioned in earlier comments on this PR).

Perhaps you could try uncommenting these lines in convert_tflite.py and see if it makes a difference:

    #converter.inference_input_type = tf.uint8
    #converter.inference_output_type = tf.int8

This flag might set all intermediate weights and calculations to 8-bit, but I don't think it would work currently because the network can't be fully quantized:

    converter.target_spec.supported_types = [tf.int8]
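
You could also list every tensor in the converted model to see which ones actually got int8 parameters and which stayed float around the EXP ops. A rough sketch (note that the dtype=int32 in your printout is just the dtype of the numpy shape arrays, while the tensor dtype itself is reported as float32):

    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path='yolov3_int_8.tflite')
    interpreter.allocate_tensors()

    # Inspect every tensor's dtype and (scale, zero_point) quantization parameters
    for t in interpreter.get_tensor_details():
        print(t['name'], t['dtype'], t['quantization'])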

@Arfinul

Arfinul commented Jan 8, 2021

After converting a customized (not COCO) yolov3-tiny into .tflite format, I executed the command below:

    python convert_tflite.py --weights ./checkpoints/yolov4-416 --output ./checkpoints/yolov4-416-int8.tflite --quantize_mode int8 --dataset ./coco_dataset/coco/val207.txt

./checkpoints/yolov4-416 is not a COCO model; it comes from a customized/different dataset.

  1. Please suggest: should I still use ./coco_dataset/coco/val207.txt?
  2. If not, how can I convert my dataset from YOLO annotation format to the format of val207.txt?

@spalani7

RuntimeError: Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.
Quantization not yet supported for op: 'EXP'.

I also have the above error. @mikljohansson, were you able to fix this? If yes, can you provide the solution? Thanks!
