<alibaba_dtdataops_taskid_diagnostic_create_response>
<trace_id>21048d5a17423906420772844d2c2b</trace_id>
<data>
<is_error>false</is_error>
<performance_risk_list>
<performance_risk_list>
<influence_duration>315</influence_duration>
<is_rerun>false</is_rerun>
<suggestion>If this is an occasional issue, keep monitoring it. If it persists, contact the on-call engineer to confirm the rerun cause (ODPS-0010000: System internal error - kTimeout: Connection to worker machine bd-odps011014145170.na63 lost, the machine may be problematic or have restarted.) and optimize accordingly. [Contact on-call](dingtalk://dingtalkclient/page/link?url=https://links.alipay.com/app/room/60010d58695654059768264e/&amp;pc_slide=true)</suggestion>
<name>Rerun caused the instance to take too long</name>
<is_auto>false</is_auto>
<rerun_task_id>80534937282</rerun_task_id>
<object_id>20250318155913918g2he5nh0pam6</object_id>
<object_type>LOGVIEW</object_type>
<desc>Stage Odps/data_stability_20250318155913918g2he5nh0pam6_SQL_0_0_0_job_0/M2#2636_0 took an extra 315 seconds because of a rerun</desc>
</performance_risk_list>
</performance_risk_list>
<id>776951</id>
<node_risk_list>
<node_risk_list>
<is_rerun>false</is_rerun>
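<suggestion>Remove the parameter set odps.sql.mapper.memory=4096 from the code, or set it to a value slightly above the actual memory usage of 648</suggestion>
<name>User-set parameter is too large, wasting resources</name>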
<is_auto>true</is_auto>
<object_id>78798791434.0</object_id>
<object_type>TASK</object_type>
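<desc>1 Mapper node: peak memory used (648.0) is less than the requested memory (4096.0); the oversized memory request slows the job down. Remove or adjust the parameter.</desc>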
</node_risk_list>
</node_risk_list>
<url>https://pre-tuzhi.alibaba-inc.com/data-daemon/govern/diagnosticTool/taskDiagnosis/detail?taskId=776951</url>
<error_risk_list>
<error_risk_list>
<influence_duration>0</influence_duration>
<is_rerun>false</is_rerun>
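<suggestion>Cause: while the ODPS job was running, one of the task's workers used more resources than it requested and was killed. Suggestions: 1. See the [reference document](https://aliyuque.antfin.com/wufang.wq/vp0u4m/qnv85fmxdhcc384p?singleDoc# ) to locate the OOM worker and find a fix.</suggestion>
<name>Runtime error</name>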
<is_auto>false</is_auto>
<rerun_task_id>79578257865</rerun_task_id>
<object_id>20250216195027297gaz80w4vjo7</object_id>
<object_type>LOGVIEW</object_type>
<desc>Error in module: common, error code: ODPS-0010000, partial error message: System internal error - fuxi job failed, caused by: kWorkerOutOfMemory(errCode:256) at Odps/sec_aeapp_20250216195027297gaz80w4vjo7_SQL_0_1_0_job_0/R6_5@bd-odps033056071142.sg113#205. Detail error msg: KILL_NAKILL_NA: plan</desc>
</error_risk_list>
</error_risk_list>
<status>SUCCESS</status>
</data>
</alibaba_dtdataops_taskid_diagnostic_create_response>
{
"alibaba_dtdataops_taskid_diagnostic_create_response":{
"trace_id":"21048d5a17423906420772844d2c2b",
"data":{
"is_error":false,
"performance_risk_list":{
"performance_risk_list":[
{
"influence_duration":315,
"is_rerun":false,
"suggestion":"If this is an occasional issue, keep monitoring it. If it persists, contact the on-call engineer to confirm the rerun cause (ODPS-0010000: System internal error - kTimeout: Connection to worker machine bd-odps011014145170.na63 lost, the machine may be problematic or have restarted.) and optimize accordingly. [Contact on-call](dingtalk:\/\/dingtalkclient\/page\/link?url=https:\/\/links.alipay.com\/app\/room\/60010d58695654059768264e\/&pc_slide=true)",
"name":"Rerun caused the instance to take too long",
"is_auto":false,
"rerun_task_id":80534937282,
"object_id":"20250318155913918g2he5nh0pam6",
"object_type":"LOGVIEW",
"desc":"Stage Odps\/data_stability_20250318155913918g2he5nh0pam6_SQL_0_0_0_job_0\/M2#2636_0 took an extra 315 seconds because of a rerun"
}
]
},
"id":776951,
"node_risk_list":{
"node_risk_list":[
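{
"is_rerun":false,
"suggestion":"Remove the parameter set odps.sql.mapper.memory=4096 from the code, or set it to a value slightly above the actual memory usage of 648",
"name":"User-set parameter is too large, wasting resources",
"is_auto":true,
"object_id":"78798791434.0",
"object_type":"TASK",
"desc":"1 Mapper node: peak memory used (648.0) is less than the requested memory (4096.0); the oversized memory request slows the job down. Remove or adjust the parameter."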
}
]
},
"url":"https:\/\/pre-tuzhi.alibaba-inc.com\/data-daemon\/govern\/diagnosticTool\/taskDiagnosis\/detail?taskId=776951",
"error_risk_list":{
"error_risk_list":[
{
"influence_duration":0,
"is_rerun":false,
"suggestion":"Cause: while the ODPS job was running, one of the task's workers used more resources than it requested and was killed. Suggestions: 1. See the [reference document](https:\/\/aliyuque.antfin.com\/wufang.wq\/vp0u4m\/qnv85fmxdhcc384p?singleDoc# ) to locate the OOM worker and find a fix.",
"name":"Runtime error",
"is_auto":false,
"rerun_task_id":79578257865,
"object_id":"20250216195027297gaz80w4vjo7",
"object_type":"LOGVIEW",
"desc":"Error in module: common, error code: ODPS-0010000, partial error message: System internal error - fuxi job failed, caused by: kWorkerOutOfMemory(errCode:256) at Odps\/sec_aeapp_20250216195027297gaz80w4vjo7_SQL_0_1_0_job_0\/R6_5@bd-odps033056071142.sg113#205. Detail error msg: KILL_NAKILL_NA: plan "
}
]
},
"status":"SUCCESS"
}
}
}