Notes from Executive Director Scott Garrison – June 2024

Continuing from my May MCLS newsletter piece on artificial intelligence (AI), this month I’ll recap the excellent presentation that Western Michigan University (WMU) Libraries Associate Dean Mary O’Kelly gave on using AI tools for research and writing at the recent Michigan Academic Library Association (MiALA) conference. AI, and how various industries are using it, shows up in the news daily, and O’Kelly noted that current AI tools are affecting every aspect of education. In an interactive presentation that included poll questions she asked the audience via Mentimeter (which itself can group responses using AI), O’Kelly focused on three complementary AI tools that, used together, can search, analyze, and draft research articles: 

  1. Litmaps allows us to search Google Scholar for a topic and creates a mindmap connecting a set of related articles, showing the number of references, number of times cited, and date for each. To reorient the map around an article’s references or citations, simply click on that article in the map and then the references or citations function in the article’s brief summary view. Litmaps’ color-coded tagging feature lets us see articles we’ve tagged as we shift our view around a related set of articles via references or citations (where relevant). Litmaps also links out to article full text (a subscription is required in many cases) and can export a list of tagged references. Litmaps uses AI to analyze content for connections and pulls article metadata from Crossref, Semantic Scholar, and OpenAlex. 
  2. Elicit allows us to ask a research question and summarizes a set of articles in a top paragraph with citations. It displays the set in a table that can include user-selectable columns for basic citation information, summary, main findings, methodology, summary of discussion, and more. Elicit can also analyze PDF files the user uploads to it. Like Litmaps, Elicit searches academic papers from Semantic Scholar. 
  3. Once we’ve found relevant research articles, we can upload or paste full text and data into Claude, which can write an essay that compares and summarizes the information we provide it (and we can copy and paste the summary into other software for further editing). Claude can also create a bibliography in our preferred style. O’Kelly considers Claude the most powerful text analysis tool available. It was built by Anthropic (a company founded by former OpenAI employees) to be more trustworthy and reliable than other tools such as ChatGPT.  
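As an aside for the technically curious: the article metadata Litmaps pulls from OpenAlex is also available directly through OpenAlex’s free public API. The short Python sketch below extracts the same per-article fields Litmaps displays (title, date, number of references, number of times cited) from a work record. The field names follow the real OpenAlex schema, but the sample values and the summarize_work helper are my own, for illustration only:

```python
import json

# Abridged OpenAlex "work" record. Field names follow the real OpenAlex
# schema; the values here are invented for illustration.
SAMPLE_WORK = json.loads("""
{
  "title": "An Example Article on AI in Libraries",
  "publication_year": 2023,
  "cited_by_count": 42,
  "referenced_works": ["https://openalex.org/W1", "https://openalex.org/W2"]
}
""")

def summarize_work(work):
    """Pull out the fields a citation map displays for each node:
    title, year, number of references, and number of times cited."""
    return {
        "title": work["title"],
        "year": work["publication_year"],
        "references": len(work.get("referenced_works", [])),
        "cited_by": work["cited_by_count"],
    }

# Fetching a live record instead would look like (network access required):
#   import urllib.request
#   url = "https://api.openalex.org/works?search=AI+in+academic+libraries"
#   works = json.load(urllib.request.urlopen(url))["results"]

print(summarize_work(SAMPLE_WORK))
```

Counting an article’s `referenced_works` and reading its `cited_by_count` is, at a small scale, the same connection data a tool like Litmaps visualizes as a map.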

Having experimented a bit with ChatGPT and then Claude, I got curious about how each was trained. When I asked ChatGPT how it is trained, it replied: 

“I’ve been trained on a diverse range of internet text, including books, articles, websites, and other sources, to develop a broad understanding of human language. My training involves processing vast amounts of text data and learning the patterns and structures inherent in language, allowing me to generate responses to various prompts. If you have any specific questions about my training or how I work, feel free to ask!” 

When I asked Claude how it is trained, it replied: 

“I was trained using constitutional AI, which involves using machine learning techniques to align my behaviors with certain guidelines or principles during the training process. From what I understand, the key aspects of my training approach were: 

  1. Oversight and Curation: The training data I ingested was carefully curated and filtered by the Anthropic team to remove potential sources of bias, toxicity, or misinformation. 
  2. Constitutional Rules: A set of key principles around being helpful, harmless, honest, ethical, and protecting individual privacy were encoded into the reinforcement learning that shaped my behavior. 
  3. Multi-Task Learning: In addition to open-ended dialogue, I was trained on a wide variety of tasks like analysis, coding, math, and question-answering using a diverse corpus of high-quality data sources. 
  4. Iterative Refinement: My outputs were continuously evaluated and my training was refined through techniques like reinforcement learning and recursive reward modeling to better align me with Anthropic’s intended principles. 
  5. Knowledge Cutoff: My training data cutoff in August 2023, giving me a stable and consistent knowledge base that doesn’t shift over time. 

However, many of the technical details like model architecture, computer resources used, hyperparameters, and the precise data sources involved are not known to me. I aim to be as transparent as I can about my training while respecting the intellectual property of Anthropic. Please let me know if you have any other questions!”  

Noting Claude’s transparency and curious to learn more about ChatGPT, I also asked Claude how ChatGPT was trained. It replied:  

“I don’t have insider knowledge about the specific training process used for ChatGPT, as that is proprietary information held by OpenAI. However, based on publicly available information, we know some key points about how ChatGPT was trained: 

  1. Constitutional AI: ChatGPT was trained using an approach called “constitutional AI” developed by researchers at OpenAI. This involves using reinforcement learning to instill desired behavior in alignment with set of rules or guidelines. 
  2. Web Data: A substantial portion of ChatGPT’s training data comes from crawling and ingesting a massive corpus of text data from the internet, including websites, books, articles, and other online content. 
  3. Supervised Fine-Tuning: After pretraining on the broad web data, ChatGPT went through additional supervised fine-tuning on smaller, curated datasets to enhance its abilities in areas like question-answering and dialogue. 
  4. Reinforcement Learning: Techniques like reinforcement learning from human feedback were used to refine ChatGPT’s outputs to be more relevant, truthful and harmless. 
  5. Model Scale: ChatGPT leverages an extremely large neural network model with over 175 billion parameters, allowing it to ingest and learn patterns from a staggering amount of training data. 

However, many details about the model architecture, computer resources used, hyperparameters, techniques to instill certain capabilities, and the exact datasets involved remain closely guarded by OpenAI. Their published research only discloses high-level methodologies. But this outlines the general training approach at a high level from publicly available information. Let me know if you need any other perspective on this topic!” 

When I asked Claude to compare its own training with ChatGPT’s, it told me the following based on publicly available information, which I’ll summarize here (note that at least one claim below is itself a hallucination: constitutional AI is Anthropic’s own technique, and OpenAI describes ChatGPT’s alignment training as reinforcement learning from human feedback): 

  1. ChatGPT was primarily trained through crawling a large number of Web pages, and Claude’s training was curated and filtered by Anthropic. 
  2. Both were trained using constitutional AI, with Claude’s training instilling principles around being helpful, harmless, honest, and ethical. 
  3. While ChatGPT’s training data is updated periodically, Claude has a training cutoff of August 2023, which it says provides stable and consistent knowledge. 
  4. Claude has undergone multi-task training across areas like analysis, coding, math, and question-answering, and to Claude, ChatGPT’s training “seems more focused on general language skills and open-domain knowledge.” 
  5. Finally, whereas ChatGPT uses a 175 billion-parameter model, Claude’s model “is comparatively smaller in scale.” 

Although Claude can hallucinate like other LLMs do, users can upload the exact text they want analyzed, which minimizes errors. If prompted to do so, Claude will also respond only when it is confident in its answer, and otherwise say, “I don’t know.” Claude can analyze pasted text as well as uploaded .csv files (such as Excel exports), Word documents, PDFs, and .txt, .rtf, and .epub files. It can also be embedded in Google Sheets to provide even more data analysis help. ChatGPT told me that I could paste text into a conversation with it or provide links to publicly available content.   
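For readers who want to try that “respond only if confident” behavior themselves, here is a minimal Python sketch of such a grounding prompt. The build_grounded_prompt helper and its wording are my own illustration, not anything from Anthropic; the resulting string can simply be pasted into a Claude conversation along with your source text:

```python
def build_grounded_prompt(source_text: str, question: str) -> str:
    """Wrap user-supplied text in an instruction asking the model to
    answer only from that text and to say "I don't know" when unsure,
    mirroring the prompting approach described above."""
    return (
        "Answer the question using only the source text below. "
        "If the answer is not in the text, or you are not confident, "
        "reply exactly: I don't know.\n\n"
        f"Source text:\n{source_text}\n\n"
        f"Question: {question}"
    )

prompt = build_grounded_prompt(
    "Claude was built by Anthropic.", "Who built Claude?"
)
print(prompt)
```

Restricting the model to text you supply, and giving it explicit permission to decline, is the same error-minimizing idea described above, expressed as a reusable prompt template.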

O’Kelly briefly mentioned some other AI-based tools, such as the Packback writing/grading assistant, Clarivate’s Alethea deep reading tutor (see also a May 21 press release for more about Clarivate’s AI strategy), and ResearchRabbit, and noted that Zotero now has ARIA, an AI-based plugin. She also touched on the gap between what some tools purport to do and their actual mixed results, and on some tools’ false positives when trying to detect AI in student work. In closing, O’Kelly emphasized that vendors are building AI into systems that libraries and their users will need to understand, and that faculty and students are already using tools like those she described. She asked the audience to consider what all of this means for library discovery, how users use library collections, and the library’s role. O’Kelly suggested that session participants watch for more on the paradoxes of AI in academic libraries in a chapter she is writing for the forthcoming book AI for Academic Libraries: Navigating the Future. 

My thanks to Mary O’Kelly for her enlightening presentation, and her guidance on this summary.  

What are you doing with AI currently, and what do you think about it? Let me know, at garrisons@mcls.org.