AI language models are spreading and changing at an incredibly fast pace. This LibGuide was created in late February 2023, so the information it contains may be out of date by the time you view this content.
"Large Language Models, such as GPT-3, are trained on vast amounts of text data from the internet and are capable of generating human-like text, but they may not always produce output that is consistent with human expectations or desirable values. In fact, their objective function is a probability distribution over word sequences that allows them to predict what the next word is in a sequence." - Assembly AI
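The "probability distribution over word sequences" described above can be illustrated with a toy sketch. This is not GPT-3's actual model; the candidate words and probabilities below are made-up values standing in for what a real model learns from training data.

```python
import random

# Hypothetical learned probabilities for the word that follows "the cat".
# A real LLM computes such a distribution over its entire vocabulary.
next_word_probs = {"sat": 0.5, "ran": 0.3, "slept": 0.2}

def predict_next(probs):
    # Sample the next word in proportion to its predicted probability.
    words = list(probs)
    weights = [probs[w] for w in words]
    return random.choices(words, weights=weights, k=1)[0]

print(predict_next(next_word_probs))  # most often "sat", sometimes "ran" or "slept"
```

Generating a longer passage simply repeats this step: the chosen word is appended to the sequence, and the model predicts the next word from the new, longer context.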
"Internally ChatGPT uses a combination of machine learning algorithms and deep learning techniques to process and generate text responses. When a user inputs a message into the chat, the system first tokenizes the text, which involves breaking down the words and sentences into individual units. The tokens are then passed through a series of layers, which include the encoder and decoder layers, to generate a response.
One of the key technical details of ChatGPT’s internal architecture is its use of attention mechanisms. Attention mechanisms allow the model to focus on specific parts of the input text, which helps it generate more relevant and contextually accurate responses. This is particularly important in the context of a conversation, where previous messages need to be considered when generating a response.
Another technical detail of ChatGPT’s internal architecture is its use of memory modules. These modules allow the model to retain information from previous messages, which helps it generate more coherent and consistent responses. This is especially useful in longer conversations, where the model needs to maintain a sense of context and coherence." - Unimedia
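The tokenization and attention steps described in the quoted passage can be sketched in miniature. This is an illustrative toy, not ChatGPT's implementation: real systems use subword tokenizers (such as byte-pair encoding) rather than whitespace splitting, and the vectors below are tiny made-up examples rather than learned embeddings.

```python
import math

def tokenize(text):
    # Stand-in for a real subword tokenizer: split on whitespace.
    return text.lower().split()

def softmax(xs):
    # Convert raw scores into a probability distribution.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    # Scaled dot-product attention: score each key against the query,
    # then return a relevance-weighted average of the values.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    return [sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))]

tokens = tokenize("The tokens are passed through encoder layers")
print(tokens)

# The query is most similar to the first key, so the output leans
# toward the first value vector.
out = attention(query=[1.0, 0.0],
                keys=[[1.0, 0.0], [0.0, 1.0]],
                values=[[10.0, 0.0], [0.0, 10.0]])
print(out)
```

In the example, the query matches the first key more closely, so the attention weights pull the output toward the first value — the same idea, at scale, that lets the model weight earlier parts of a conversation when generating a response.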
Sections of this LibGuide were copied and/or adapted with permission from Black Hawk College Library's page titled ChatGPT by Atticus Garrison.