AI language models fail to produce an appropriate early diagnosis more than 80% of the time, suggesting they are not yet safe for unsupervised clinical use, according to a new study.