fix SRN algorithm infer error #13851

Merged
merged 1 commit into from
Sep 11, 2024

@GreatV GreatV commented Sep 10, 2024

Close #12043

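Previously, inference with the SRN algorithm failed. During IR optimization, the pass gpu_cpu_map_matmul_v2_to_matmul_pass logged the warning W0910 03:27:08.048635 9645 gpu_cpu_map_matmul_to_mul_pass.cc:425] matmul op not support broadcast, please check inputs'shape. and the run then ended in a segmentation fault. So we may need to remove the pass gpu_cpu_map_matmul_v2_to_matmul_pass. (The warning is emitted from the source file gpu_cpu_map_matmul_to_mul_pass.cc, but the pass of that name still runs in the successful log further down; the pass that no longer runs there is gpu_cpu_map_matmul_v2_to_matmul_pass.)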
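A minimal sketch of what such a pass removal could look like, assuming the predictor is configured through a paddle.inference.Config object, whose delete_pass(name) method drops a pass from the IR optimization pipeline. The helper name disable_srn_incompatible_passes and the FakeConfig stand-in are hypothetical and exist only so the snippet runs without Paddle installed. The pass name is the one that emits the broadcast warning in the failing log and is absent from the successful log. This is an illustration, not the actual diff of this PR.

```python
# Pass that logs "matmul op not support broadcast" for SRN in the failing run.
SRN_INCOMPATIBLE_PASSES = ("gpu_cpu_map_matmul_v2_to_matmul_pass",)


def disable_srn_incompatible_passes(config, rec_algorithm):
    """Delete the passes above when the recognizer is SRN.

    `config` is expected to expose delete_pass(name), as
    paddle.inference.Config does. Returns the names that were deleted.
    """
    if rec_algorithm != "SRN":
        return []
    for name in SRN_INCOMPATIBLE_PASSES:
        config.delete_pass(name)
    return list(SRN_INCOMPATIBLE_PASSES)


class FakeConfig:
    """Hypothetical stand-in for paddle.inference.Config (records calls only)."""

    def __init__(self):
        self.deleted = []

    def delete_pass(self, name):
        self.deleted.append(name)


cfg = FakeConfig()
removed = disable_srn_incompatible_passes(cfg, "SRN")
print(removed)
```

With a real Config, the call would have to happen before the predictor is created, since the pass list is consumed when the predictor is built.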

python3 tools/infer/predict_rec.py --image_dir="./doc/imgs_words/en/word_1.png" --rec_model_dir="./inference/rec_srn/" --rec_image_shape="1,64,256"  --rec_algorithm="SRN" --rec_char_dict_path=./ppocr/utils/ic15_dict.txt  --use_space_char=False
[2024/09/10 03:27:06] ppocr WARNING: The first GPU is used for inference by default, GPU ID: 0
--- Running analysis [ir_graph_build_pass]
I0910 03:27:06.930096  9645 executor.cc:183] Old Executor is Running.
--- Running analysis [ir_analysis_pass]
--- Running IR pass [map_op_to_another_pass]
I0910 03:27:06.992350  9645 fuse_pass_base.cc:59] ---  detected 58 subgraphs
--- Running IR pass [is_test_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [sparse_conv_optim_pass]
--- Running IR pass [constant_folding_pass]
I0910 03:27:07.039814  9645 fuse_pass_base.cc:59] ---  detected 5 subgraphs
--- Running IR pass [silu_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I0910 03:27:07.069940  9645 fuse_pass_base.cc:59] ---  detected 53 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
W0910 03:27:07.083305  9645 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_eltwiseadd_bn_fuse_pass) failed!
W0910 03:27:07.083314  9645 conv_bn_fuse_pass.cc:644] Pass in op compat failed.
W0910 03:27:07.083319  9645 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_eltwiseadd_bn_fuse_pass) failed!
W0910 03:27:07.083323  9645 conv_bn_fuse_pass.cc:644] Pass in op compat failed.
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [vit_attention_fuse_pass]
--- Running IR pass [fused_multi_transformer_encoder_pass]
--- Running IR pass [fused_multi_transformer_decoder_pass]
--- Running IR pass [fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [fuse_multi_transformer_layer_pass]
--- Running IR pass [gpu_cpu_squeeze2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_reshape2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_flatten2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
I0910 03:27:08.046818  9645 fuse_pass_base.cc:59] ---  detected 65 subgraphs
--- Running IR pass [gpu_cpu_map_matmul_v2_to_matmul_pass]
W0910 03:27:08.048635  9645 gpu_cpu_map_matmul_to_mul_pass.cc:425] matmul op not support broadcast, please check inputs'shape.
I0910 03:27:08.048646  9645 fuse_pass_base.cc:59] ---  detected 21 subgraphs
--- Running IR pass [matmul_scale_fuse_pass]
I0910 03:27:08.050783  9645 fuse_pass_base.cc:59] ---  detected 11 subgraphs
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
I0910 03:27:08.110085  9645 fuse_pass_base.cc:59] ---  detected 24 subgraphs
--- Running IR pass [fc_elementwise_layernorm_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
I0910 03:27:08.128873  9645 fuse_pass_base.cc:59] ---  detected 33 subgraphs
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
I0910 03:27:08.134982  9645 fuse_pass_base.cc:59] ---  detected 4 subgraphs
--- Running IR pass [conv_elementwise_add_fuse_pass]
W0910 03:27:08.137372  9645 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_elementwise_add_fuse_pass) failed!
W0910 03:27:08.137379  9645 conv_elementwise_add_fuse_pass.cc:97] Pass in op compat failed.
W0910 03:27:08.137382  9645 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_elementwise_add_fuse_pass) failed!
W0910 03:27:08.137393  9645 conv_elementwise_add_fuse_pass.cc:97] Pass in op compat failed.
W0910 03:27:08.137398  9645 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_elementwise_add_fuse_pass) failed!
W0910 03:27:08.137400  9645 conv_elementwise_add_fuse_pass.cc:97] Pass in op compat failed.
W0910 03:27:08.137404  9645 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_elementwise_add_fuse_pass) failed!
W0910 03:27:08.137408  9645 conv_elementwise_add_fuse_pass.cc:97] Pass in op compat failed.
W0910 03:27:08.137410  9645 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_elementwise_add_fuse_pass) failed!
W0910 03:27:08.137413  9645 conv_elementwise_add_fuse_pass.cc:97] Pass in op compat failed.
I0910 03:27:08.138523  9645 fuse_pass_base.cc:59] ---  detected 16 subgraphs
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running IR pass [transfer_layout_pass]
--- Running IR pass [transfer_layout_elim_pass]
--- Running IR pass [auto_mixed_precision_pass]
--- Running IR pass [identity_op_clean_pass]
I0910 03:27:08.163686  9645 fuse_pass_base.cc:59] ---  detected 4 subgraphs
--- Running IR pass [inplace_op_var_pass]
I0910 03:27:08.164572  9645 fuse_pass_base.cc:59] ---  detected 55 subgraphs
--- Running analysis [ir_params_sync_among_devices_pass]
I0910 03:27:08.165179  9645 ir_params_sync_among_devices_pass.cc:49] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [save_optimized_model_pass]
--- Running analysis [ir_graph_to_program_pass]
I0910 03:27:08.249032  9645 analysis_predictor.cc:2300] ======= ir optimization completed =======
I0910 03:27:08.249928  9645 naive_executor.cc:207] ---  skip [feed], feed -> data_3
I0910 03:27:08.249934  9645 naive_executor.cc:207] ---  skip [feed], feed -> data_2
I0910 03:27:08.249938  9645 naive_executor.cc:207] ---  skip [feed], feed -> data_1
I0910 03:27:08.249941  9645 naive_executor.cc:207] ---  skip [feed], feed -> data_0
I0910 03:27:08.249944  9645 naive_executor.cc:207] ---  skip [feed], feed -> x
I0910 03:27:08.250773  9645 naive_executor.cc:207] ---  skip [save_infer_model/scale_0.tmp_0], fetch -> fetch
I0910 03:27:08.250779  9645 naive_executor.cc:207] ---  skip [matmul_v2_21.tmp_0], fetch -> fetch
I0910 03:27:08.250782  9645 naive_executor.cc:207] ---  skip [save_infer_model/scale_2.tmp_0], fetch -> fetch
I0910 03:27:08.250787  9645 naive_executor.cc:207] ---  skip [save_infer_model/scale_3.tmp_0], fetch -> fetch
I0910 03:27:08.250790  9645 naive_executor.cc:207] ---  skip [save_infer_model/scale_4.tmp_0], fetch -> fetch
[2024/09/10 03:27:08] ppocr INFO: In PP-OCRv3, rec_image_shape parameter defaults to '3, 48, 320', if you are using recognition model with PP-OCRv2 or an older version, please set --rec_image_shape='3,32,320
W0910 03:27:08.251653  9645 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.4, Runtime API Version: 12.3
W0910 03:27:08.252079  9645 gpu_resources.cc:164] device: 0, cuDNN Version: 9.0.
I0910 03:27:08.252254  9645 program_interpreter.cc:243] New Executor is Running.


--------------------------------------
C++ Traceback (most recent call last):
--------------------------------------
0   paddle::AnalysisPredictor::ZeroCopyRun(bool)
1   paddle::framework::NaiveExecutor::RunInterpreterCore(std::vector<std::string, std::allocator<std::string > > const&, bool, bool)
2   paddle::framework::InterpreterCore::Run(std::vector<std::string, std::allocator<std::string > > const&, bool, bool, bool, bool)
3   paddle::framework::ProgramInterpreter::Run(std::vector<std::string, std::allocator<std::string > > const&, bool, bool, bool, bool)
4   paddle::framework::ProgramInterpreter::Build(std::vector<std::string, std::allocator<std::string > > const&, std::vector<paddle::framework::OpFuncNode, std::allocator<paddle::framework::OpFuncNode> >*, bool)
5   paddle::framework::interpreter::BuildOpFuncList(phi::Place const&, paddle::framework::BlockDesc const&, std::set<std::string, std::less<std::string >, std::allocator<std::string > > const&, std::vector<paddle::framework::OpFuncNode, std::allocator<paddle::framework::OpFuncNode> >*, paddle::framework::VariableScope*, paddle::framework::interpreter::ExecutionConfig const&, std::vector<std::function<void (paddle::framework::OperatorBase*, paddle::framework::Scope*)>, std::allocator<std::function<void (paddle::framework::OperatorBase*, paddle::framework::Scope*)> > > const&, std::vector<std::function<void (paddle::framework::OperatorBase*, paddle::framework::Scope*)>, std::allocator<std::function<void (paddle::framework::OperatorBase*, paddle::framework::Scope*)> > > const&, bool, bool)
6   paddle::operators::Reshape2Op::InferShape(paddle::framework::InferShapeContext*) const

----------------------
Error Message Summary:
----------------------
FatalError: `Segmentation fault` is detected by the operating system.
  [TimeInfo: *** Aborted at 1725938828 (unix time) try "date -d @1725938828" if you are using GNU date ***]
  [SignalInfo: *** SIGSEGV (@0x0) received by PID 9645 (TID 0x7fa41263a740) from PID 0 ***]

After modifying the code, the same command runs successfully and produces a prediction:

[2024/09/10 03:28:49] ppocr WARNING: The first GPU is used for inference by default, GPU ID: 0
--- Running analysis [ir_graph_build_pass]
I0910 03:28:49.224186  9852 executor.cc:183] Old Executor is Running.
--- Running analysis [ir_analysis_pass]
--- Running IR pass [map_op_to_another_pass]
I0910 03:28:49.286415  9852 fuse_pass_base.cc:59] ---  detected 58 subgraphs
--- Running IR pass [is_test_pass]
--- Running IR pass [simplify_with_basic_ops_pass]
--- Running IR pass [delete_quant_dequant_linear_op_pass]
--- Running IR pass [delete_weight_dequant_linear_op_pass]
--- Running IR pass [sparse_conv_optim_pass]
--- Running IR pass [constant_folding_pass]
I0910 03:28:49.332994  9852 fuse_pass_base.cc:59] ---  detected 5 subgraphs
--- Running IR pass [silu_fuse_pass]
--- Running IR pass [conv_bn_fuse_pass]
I0910 03:28:49.363368  9852 fuse_pass_base.cc:59] ---  detected 53 subgraphs
--- Running IR pass [conv_eltwiseadd_bn_fuse_pass]
W0910 03:28:49.376542  9852 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_eltwiseadd_bn_fuse_pass) failed!
W0910 03:28:49.376551  9852 conv_bn_fuse_pass.cc:644] Pass in op compat failed.
W0910 03:28:49.376559  9852 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_eltwiseadd_bn_fuse_pass) failed!
W0910 03:28:49.376562  9852 conv_bn_fuse_pass.cc:644] Pass in op compat failed.
--- Running IR pass [embedding_eltwise_layernorm_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v2]
--- Running IR pass [vit_attention_fuse_pass]
--- Running IR pass [fused_multi_transformer_encoder_pass]
--- Running IR pass [fused_multi_transformer_decoder_pass]
--- Running IR pass [fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_encoder_fuse_qkv_pass]
--- Running IR pass [multi_devices_fused_multi_transformer_decoder_fuse_qkv_pass]
--- Running IR pass [fuse_multi_transformer_layer_pass]
--- Running IR pass [gpu_cpu_squeeze2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_reshape2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_flatten2_matmul_fuse_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
I0910 03:28:50.330798  9852 fuse_pass_base.cc:59] ---  detected 65 subgraphs
--- Running IR pass [matmul_scale_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
--- Running IR pass [fc_fuse_pass]
I0910 03:28:50.390921  9852 fuse_pass_base.cc:59] ---  detected 24 subgraphs
--- Running IR pass [fc_elementwise_layernorm_fuse_pass]
--- Running IR pass [conv_elementwise_add_act_fuse_pass]
I0910 03:28:50.409237  9852 fuse_pass_base.cc:59] ---  detected 33 subgraphs
--- Running IR pass [conv_elementwise_add2_act_fuse_pass]
I0910 03:28:50.415278  9852 fuse_pass_base.cc:59] ---  detected 4 subgraphs
--- Running IR pass [conv_elementwise_add_fuse_pass]
W0910 03:28:50.417644  9852 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_elementwise_add_fuse_pass) failed!
W0910 03:28:50.417652  9852 conv_elementwise_add_fuse_pass.cc:97] Pass in op compat failed.
W0910 03:28:50.417657  9852 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_elementwise_add_fuse_pass) failed!
W0910 03:28:50.417660  9852 conv_elementwise_add_fuse_pass.cc:97] Pass in op compat failed.
W0910 03:28:50.417665  9852 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_elementwise_add_fuse_pass) failed!
W0910 03:28:50.417668  9852 conv_elementwise_add_fuse_pass.cc:97] Pass in op compat failed.
W0910 03:28:50.417680  9852 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_elementwise_add_fuse_pass) failed!
W0910 03:28:50.417685  9852 conv_elementwise_add_fuse_pass.cc:97] Pass in op compat failed.
W0910 03:28:50.417688  9852 op_compat_sensible_pass.cc:232]  Check the Attr(axis) of Op(elementwise_add) in pass(conv_elementwise_add_fuse_pass) failed!
W0910 03:28:50.417692  9852 conv_elementwise_add_fuse_pass.cc:97] Pass in op compat failed.
I0910 03:28:50.418787  9852 fuse_pass_base.cc:59] ---  detected 16 subgraphs
--- Running IR pass [transpose_flatten_concat_fuse_pass]
--- Running IR pass [transfer_layout_pass]
--- Running IR pass [transfer_layout_elim_pass]
--- Running IR pass [auto_mixed_precision_pass]
--- Running IR pass [identity_op_clean_pass]
I0910 03:28:50.444590  9852 fuse_pass_base.cc:59] ---  detected 4 subgraphs
--- Running IR pass [inplace_op_var_pass]
I0910 03:28:50.445502  9852 fuse_pass_base.cc:59] ---  detected 54 subgraphs
--- Running analysis [ir_params_sync_among_devices_pass]
I0910 03:28:50.446126  9852 ir_params_sync_among_devices_pass.cc:49] Sync params from CPU to GPU
--- Running analysis [adjust_cudnn_workspace_size_pass]
--- Running analysis [inference_op_replace_pass]
--- Running analysis [save_optimized_model_pass]
--- Running analysis [ir_graph_to_program_pass]
I0910 03:28:50.529299  9852 analysis_predictor.cc:2300] ======= ir optimization completed =======
I0910 03:28:50.530289  9852 naive_executor.cc:207] ---  skip [feed], feed -> data_3
I0910 03:28:50.530297  9852 naive_executor.cc:207] ---  skip [feed], feed -> data_2
I0910 03:28:50.530300  9852 naive_executor.cc:207] ---  skip [feed], feed -> data_1
I0910 03:28:50.530304  9852 naive_executor.cc:207] ---  skip [feed], feed -> data_0
I0910 03:28:50.530308  9852 naive_executor.cc:207] ---  skip [feed], feed -> x
I0910 03:28:50.531141  9852 naive_executor.cc:207] ---  skip [save_infer_model/scale_0.tmp_0], fetch -> fetch
I0910 03:28:50.531147  9852 naive_executor.cc:207] ---  skip [matmul_v2_21.tmp_0], fetch -> fetch
I0910 03:28:50.531150  9852 naive_executor.cc:207] ---  skip [save_infer_model/scale_2.tmp_0], fetch -> fetch
I0910 03:28:50.531154  9852 naive_executor.cc:207] ---  skip [save_infer_model/scale_3.tmp_0], fetch -> fetch
I0910 03:28:50.531158  9852 naive_executor.cc:207] ---  skip [save_infer_model/scale_4.tmp_0], fetch -> fetch
[2024/09/10 03:28:50] ppocr INFO: In PP-OCRv3, rec_image_shape parameter defaults to '3, 48, 320', if you are using recognition model with PP-OCRv2 or an older version, please set --rec_image_shape='3,32,320
W0910 03:28:50.532030  9852 gpu_resources.cc:119] Please NOTE: device: 0, GPU Compute Capability: 8.6, Driver API Version: 12.4, Runtime API Version: 12.3
W0910 03:28:50.532446  9852 gpu_resources.cc:164] device: 0, cuDNN Version: 9.0.
I0910 03:28:50.532619  9852 program_interpreter.cc:243] New Executor is Running.
[2024/09/10 03:28:50] ppocr INFO: Predicts of ./doc/imgs_words/en/word_1.png:('joint', 1.0)
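To see which IR pass differs between the failing and the successful run, the two logs can be compared mechanically. The sketch below is only an illustration: the helper names ir_passes and passes_removed are hypothetical, and the two strings are short excerpts of the matmul-related section of each log above, not the full output.

```python
import re

# Matches lines such as "--- Running IR pass [fc_fuse_pass]",
# with or without surrounding terminal color codes.
PASS_RE = re.compile(r"Running IR pass \[(\w+)\]")


def ir_passes(log_text):
    """Return the IR pass names in the order they appear in a log."""
    return PASS_RE.findall(log_text)


def passes_removed(before_log, after_log):
    """Return passes present in before_log but absent from after_log."""
    after = set(ir_passes(after_log))
    return [name for name in ir_passes(before_log) if name not in after]


# Excerpt of the failing run.
before = """\
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
--- Running IR pass [gpu_cpu_map_matmul_v2_to_matmul_pass]
--- Running IR pass [matmul_scale_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
"""

# Excerpt of the successful run.
after = """\
--- Running IR pass [gpu_cpu_map_matmul_v2_to_mul_pass]
--- Running IR pass [matmul_scale_fuse_pass]
--- Running IR pass [multihead_matmul_fuse_pass_v3]
--- Running IR pass [gpu_cpu_map_matmul_to_mul_pass]
"""

removed_passes = passes_removed(before, after)
print(removed_passes)  # ['gpu_cpu_map_matmul_v2_to_matmul_pass']
```

On these excerpts, the only pass missing from the successful run is gpu_cpu_map_matmul_v2_to_matmul_pass, while gpu_cpu_map_matmul_to_mul_pass is present in both.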

@UserWangZz (Collaborator) commented

image
In a Windows environment, nothing is printed: no error and no prediction.
I am using paddle 2.6.1.post112.

@UserWangZz (Collaborator) left a comment

With this config, I can get output.

@jzhang533 (Collaborator) left a comment

LGTM

@jzhang533 jzhang533 merged commit 1c8233d into PaddlePaddle:main Sep 11, 2024
4 checks passed
@GreatV GreatV deleted the fix_srn_infer branch September 11, 2024 12:21
Successfully merging this pull request may close these issues.

[SRN - PPOCR v4] Error when run predict inference
3 participants