Miguel Afonso Caetano on Nostr:
#ML #Science #Reproducibility #DataLeakage: "To minimize errors in ML-based science, and to make it more apparent when errors do creep in, we propose REFORMS (Reporting standards for Machine Learning Based Science) in a preprint released today. It is a checklist of 32 items that can be helpful for researchers conducting ML-based science, referees reviewing it, and journals where it is submitted and published.
The checklist was developed by a consensus of 19 researchers across computer science, data science, social sciences, mathematics, and biomedical research. The disciplinary diversity of the authors was essential to ensure that the standards are useful across many fields. A majority of the authors were speakers or organizers at a workshop we organized last year titled "The Reproducibility Crisis in ML-Based Science." (Videos of the talks and discussions are available on the workshop page.)
The checklist and the paper introducing it are available on our project website. The paper also provides a review of past failures, as well as best practices for avoiding such failures."
https://www.aisnakeoil.com/p/introducing-the-reforms-checklist
Published at 2023-08-18 20:12:57
Event JSON
{
"id": "39b3e13aea399f54ad2c7a172a695cf9f6fb9710af6b46af4cecaae544b57c00",
"pubkey": "0bb8cfad2c4ef2f694feb68708f67a94d85b29d15080df8174b8485e471b6683",
"created_at": 1692389577,
"kind": 1,
"tags": [
[
"t",
"ml"
],
[
"t",
"science"
],
[
"t",
"reproducibility"
],
[
"t",
"dataleakage"
],
[
"proxy",
"https://tldr.nettime.org/users/remixtures/statuses/110912443321400803",
"activitypub"
]
],
"content": "#ML #Science #Reproducibility #DataLeakage: \"To minimize errors in ML-based science, and to make it more apparent when errors do creep in, we propose REFORMS (Reporting standards for Machine Learning Based Science) in a preprint released today. It is a checklist of 32 items that can be helpful for researchers conducting ML-based science, referees reviewing it, and journals where it is submitted and published. \n\nThe checklist was developed by a consensus of 19 researchers across computer science, data science, social sciences, mathematics, and biomedical research. The disciplinary diversity of the authors was essential to ensure that the standards are useful across many fields. A majority of the authors were speakers or organizers at a workshop we organized last year titled \"The Reproducibility Crisis in ML-Based Science.\" (Videos of the talks and discussions are available on the workshop page.) \n\nThe checklist and the paper introducing it are available on our project website. The paper also provides a review of past failures, as well as best practices for avoiding such failures.\" \n\nhttps://www.aisnakeoil.com/p/introducing-the-reforms-checklist",
"sig": "64ef39a82408398bffe0212966ad03abd664fa6e854a4d2265b05ce987cd821ef7d9399b843f391f30fd1d7954fd94627ed6a591696ca3ad41d4d8ab3c222430"
}