Tonic Textual Reveals LLM-Based Annotation Workflow Achieving 0.71 F1 with Minimal Human Input

Tonic Textual unveils an LLM-based annotation workflow that reaches a 0.71 F1 score with just ten human-labeled validation examples, cutting the human effort needed for NER.

In a breakthrough that could reshape the landscape of natural language processing (NLP), Tonic Textual has introduced a custom entity workflow that significantly reduces the costs associated with human annotation in named entity recognition (NER). This innovation comes at a time when the demand for high-quality training data is growing, yet the traditional methods of data annotation remain slow and expensive.

The NCBI Disease Corpus serves as a prime example of the challenges involved in building quality training sets. Developed over two summers by 14 annotators with biomedical informatics backgrounds, the corpus required painstaking efforts to label 793 abstracts from PubMed, with each document independently reviewed to ensure accuracy. The labor-intensive process illustrates the broader issue facing NER: recruiting domain experts and managing multi-annotator projects can be prohibitively costly, especially in specialized fields like healthcare and finance.

Tonic Textual’s approach aims to streamline this process by leveraging large language models (LLMs) to automate the annotation phase. Practitioners write clear annotation guidelines and upload a small validation set of ground truth labels. They then check the LLM's annotations against those labels, refine their instructions, and let the LLM handle the bulk of the data annotation. This method allows for rapid processing of thousands of documents, thereby addressing the traditional bottleneck in NER.
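The validation step described above amounts to scoring LLM annotations against a small set of ground truth labels. The following is a minimal sketch of entity-level scoring, not Tonic Textual's documented implementation: the `Span` record, the `entity_prf` helper, and the exact-match criterion (same document, offsets, and label) are all assumptions made for illustration, and the example spans are invented.

```python
from typing import Iterable, NamedTuple, Tuple


class Span(NamedTuple):
    """One labeled entity mention (hypothetical record layout)."""
    doc_id: str
    start: int
    end: int
    label: str


def entity_prf(gold: Iterable[Span], predicted: Iterable[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact span-and-label matching."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Invented toy spans: one LLM annotation matches the ground truth, one does not.
gold = [Span("doc1", 0, 5, "Disease"), Span("doc1", 10, 20, "Disease")]
llm = [Span("doc1", 0, 5, "Disease"), Span("doc1", 30, 35, "Disease")]
precision, recall, f1 = entity_prf(gold, llm)
print(precision, recall, f1)
```

A low score on the validation set is the signal to revise the guidelines before annotating the full corpus. Partial-overlap matching would be a reasonable alternative to the exact-match rule assumed here.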

The effectiveness of this method was put to the test using the NCBI Disease Corpus. In a controlled experiment, Tonic Textual found that a model trained only on LLM annotations, with no human-labeled training data, reached an F1 score of 0.71 against ground truth labels. This score improved incrementally as more human labels were mixed in, reaching an F1 score of 0.81 with complete human annotations. However, the returns diminished with each additional document, signaling that the core value lies in effective guidelines rather than extensive human labeling.

When compared with other approaches, Tonic Textual’s custom entity workflow demonstrated a significant advantage. While GLiNER2, a zero-shot NER model, achieved a low recall rate of 0.26 despite good precision, Tonic’s method excelled with minimal human input. A mere ten human-labeled validation examples allowed the model to achieve an F1 score of 0.71, underscoring the potential of this new annotation strategy.
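To see why a recall of 0.26 limits a model no matter how good its precision is, recall that F1 is the harmonic mean of precision and recall. The article does not report GLiNER2's precision, so the sketch below assumes the best case of perfect precision (1.0) to compute a ceiling. The function names are illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def f1_ceiling_for_recall(recall: float) -> float:
    """Best possible F1 at a given recall, assuming perfect precision."""
    return f1_score(1.0, recall)


# Recall of 0.26 is the figure quoted above; precision of 1.0 is an assumption.
print(round(f1_ceiling_for_recall(0.26), 3))  # 0.413
```

Because F1 rises with precision when recall is fixed, a recall of 0.26 cannot yield an F1 above about 0.41, well short of the 0.71 reported for the LLM-annotated workflow.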

Despite the success, a question remains about the limitations of using general-purpose models such as RoBERTa. The authors acknowledge that specialized biomedical models could close the performance gap between LLM-only and fully human-annotated training, which currently sits between 0.71 and 0.81 F1. The polysemy of gene and disease names presents a particular challenge, as abbreviations often refer to both, complicating the annotation process.

In a subsequent case study involving healthcare identifiers from electronic health records (EHR), Tonic Textual further reinforced the efficacy of its approach. The workflow began with a validation set of 123 documents and expanded to a training set of 1,119 documents, all annotated by the LLM. Iterative refinement of guidelines led to a final model achieving an impressive F1 score of 0.947, surpassing the production release threshold of 0.914. Notably, this was accomplished with no human-labeled training data, emphasizing the potential for rapid and cost-effective deployment of NER models.
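The iterative refinement against a release threshold described above can be pictured as a simple loop. This is a sketch under stated assumptions, not Tonic Textual's actual procedure: `first_passing_guidelines` is a hypothetical helper, `evaluate` stands in for whatever process annotates data with a set of guidelines, trains a model, and scores it on the validation set, and the draft names and scores in the demo are placeholders rather than reported results. Only the 0.914 threshold comes from the article.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def first_passing_guidelines(
    guideline_versions: Sequence[str],
    evaluate: Callable[[str], float],
    threshold: float,
) -> Tuple[Optional[str], List[Tuple[str, float]]]:
    """Try guideline drafts in order; stop at the first whose F1 meets the threshold.

    Returns the passing draft (or None) and the (draft, score) history.
    """
    history: List[Tuple[str, float]] = []
    for guidelines in guideline_versions:
        score = evaluate(guidelines)
        history.append((guidelines, score))
        if score >= threshold:
            return guidelines, history
    return None, history


# Placeholder scores for illustration only; they are not results from the case study.
placeholder_scores = {"draft_a": 0.5, "draft_b": 0.95}
chosen, history = first_passing_guidelines(
    ["draft_a", "draft_b"], placeholder_scores.__getitem__, threshold=0.914
)
print(chosen, history)
```

In practice each evaluation would involve re-annotating and retraining, so the number of drafts tried is the main cost to watch.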

As Tonic Textual continues to break down the barriers posed by traditional annotation processes, the implications for industries relying on NER are profound. The workflow compresses weeks of labor-intensive tasks into a matter of hours, allowing organizations to shift their focus from data collection to refining their understanding of what they seek to extract. With the ability to produce production-ready models swiftly, Tonic Textual is poised to change how practitioners approach NER, making high-quality data annotation accessible and efficient.

For organizations that want to improve their data extraction capabilities without incurring prohibitive annotation costs, Tonic Textual’s custom entity workflow offers a promising answer to a longstanding challenge in the NLP space.

Written by AiPressa Staff



© 2025 AIPressa · Part of Buzzora Media · All rights reserved.