pubmed-article:16076556 | pubmed:abstractText | The inter- and intraobserver agreement (kappa statistic) in reporting according to BI-RADS assessment categories was tested on 12 dedicated breast radiologists, with little prior working knowledge of BI-RADS, reading a set of 50 lesions (29 malignant, 21 benign). Intraobserver agreement (four categories: R2, R3, R4, R5) was fair (0.21-0.40), moderate (0.41-0.60), substantial (0.61-0.80) or almost perfect (>0.80) for one, two, five or four radiologists, respectively, or (six categories: R2, R3, R4a, R4b, R4c, R5) fair, moderate, substantial or almost perfect for three, three, three or three radiologists, respectively. Interobserver agreement (four categories) was fair, moderate or substantial for three, six or three radiologists, or (six categories) slight, fair or moderate for one, six or five radiologists. Major disagreement occurred for the intermediate categories (R3=0.12, R4=0.25, R4a=0.08, R4b=0.07, R4c=0.10). We found insufficient intra- and interobserver consistency among breast radiologists in reporting BI-RADS assessment categories. Although training may improve these results, simpler alternative reporting systems, focused on clinical decision-making, should be explored. | lld:pubmed |
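
The agreement bands quoted in the abstract (fair 0.21-0.40, moderate 0.41-0.60, substantial 0.61-0.80, almost perfect >0.80) follow the usual Landis-Koch-style interpretation of the kappa statistic. As a minimal sketch of how such a value is obtained and mapped to those bands, the Python example below computes unweighted Cohen's kappa for two hypothetical readers assigning BI-RADS categories; the reader data, function names and cut-off mapping are illustrative assumptions, not the study's analysis code.

# Illustrative sketch only: Cohen's kappa for two hypothetical readers
# assigning BI-RADS categories, mapped to the bands quoted in the abstract.
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same cases."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    categories = sorted(set(ratings_a) | set(ratings_b))
    # Observed agreement: fraction of cases where the two readers agree.
    p_observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected agreement under independence, from each reader's marginal frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    p_expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)
    return (p_observed - p_expected) / (1 - p_expected)

def agreement_band(kappa):
    """Map a kappa value to the bands cited in the abstract."""
    if kappa > 0.80:
        return "almost perfect"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    return "slight or poor"

# Hypothetical four-category assessments (R2, R3, R4, R5) for ten lesions;
# the values are made up for illustration.
reader_1 = ["R2", "R3", "R4", "R5", "R4", "R2", "R3", "R5", "R4", "R2"]
reader_2 = ["R2", "R4", "R4", "R5", "R3", "R2", "R3", "R5", "R4", "R3"]

k = cohens_kappa(reader_1, reader_2)
print(f"kappa = {k:.2f} ({agreement_band(k)})")  # prints: kappa = 0.60 (moderate)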