AI Governance Day - From Principles to Implementation
Appendix 1: Essential vocabulary for AI governance
"AI lifecycle" refers to the entire process of AI development and deployment, broken down
into distinct stages. The stages typically include:
– Design: This initial phase involves conceptualizing and designing the AI model or system
based on specific needs and objectives.
– Training: During training, the designed models learn from vast amounts of data to
develop the ability to perform tasks such as recognizing patterns or making decisions.
– Enhancement: After training, AI systems may undergo further refinements and
enhancements to improve their accuracy, efficiency, and performance.
– Deployment: In the final stage, the AI system is deployed in a real-world environment to
perform the tasks it was designed for.
A machine learning or AI model, particularly a neural network, can have billions or even trillions
of parameters. The number of parameters is often added to the name of the model, e.g.
"<name> 670B" means that the model has 670 billion parameters. The most advanced AI
models are called "frontier AI models."
The number of parameters is an important factor in a model's performance, but it is not the
only factor. Other factors include the quality of the training data and the model architecture.
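To make the notion of a parameter concrete, the short Python sketch below counts the parameters (weights and biases) of a small, hypothetical fully connected neural network; the layer sizes are illustrative and not taken from any real model.

    # Count the parameters (weights and biases) of a small,
    # hypothetical fully connected neural network.
    layer_sizes = [512, 1024, 1024, 256]  # illustrative layer widths

    total_params = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total_params += n_in * n_out  # one weight per input-output connection
        total_params += n_out         # one bias per output neuron

    print(f"{total_params:,} parameters")  # about 1.8 million for these sizes

A "670B" frontier model is the same idea scaled up by several hundred thousand times.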
Each parameter has a numerical value. Parameters are also often referred to as weights. During
the training process, the machine learning algorithm adjusts these parameters to minimize the
difference between its predictions and the actual outcomes. This process is repeated many
times and requires processing enormous amounts of data.
The final set of weights, obtained after training, is what gives the machine learning model its
predictive power.
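As a minimal illustration of this adjustment process, the sketch below fits a single-weight model (prediction = w * x) to three toy data points by gradient descent; real training does the same thing with billions of weights and vastly more data.

    # Minimal illustration of training: repeatedly adjust one weight w
    # so that predictions w * x move closer to the actual outcomes y.
    data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # toy (x, y) pairs
    w = 0.0                                       # initial weight
    learning_rate = 0.05

    for step in range(200):                # "repeated many times"
        for x, y in data:
            error = w * x - y              # prediction minus actual outcome
            gradient = 2 * error * x       # derivative of the squared error w.r.t. w
            w -= learning_rate * gradient  # nudge w to shrink the error

    print(f"learned weight: {w:.2f}")      # close to 2.0 for this data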
Once a model has been trained on a dataset, it is ready for deployment in the real world, i.e. ready for inference: inferring/deducing new content. The trained model applies what it has learned to make predictions on new, unseen data. For example, each time you enter a prompt into a chatbot, the chatbot generates a response based on its training – this is an inference. Every prompt leads to another inference.
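Continuing the toy example above, inference simply applies the final, fixed weight to inputs the model has never seen:

    # Inference: the learned weight is fixed; no further adjustment occurs.
    w = 2.04                   # final weight from the training sketch above
    for x in [4.0, 10.0]:      # inputs not present in the training data
        print(f"x = {x} -> prediction = {w * x:.1f}")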
The training process requires intensive compute, i.e. computing resources, and for frontier AI models it tends to take months on complex computing infrastructure involving specialized computer chips. In contrast, running a single inference query (e.g. having an AI model respond to a single question) requires much less compute, but the total amount of compute used for inference is still very large, since large AI companies need to serve millions of user queries per day.
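A rough back-of-envelope comparison makes the gap concrete. The sketch below uses the common approximations of about 6 floating-point operations (FLOPs) per parameter per training token and about 2 FLOPs per parameter per generated token; the model size and token counts are hypothetical.

    # Back-of-envelope compute comparison (all numbers illustrative).
    params = 70e9          # hypothetical 70-billion-parameter model
    train_tokens = 10e12   # hypothetical training set of 10 trillion tokens
    reply_tokens = 500     # tokens in a single chatbot reply

    train_flops = 6 * params * train_tokens  # ~6 FLOPs per parameter per token
    reply_flops = 2 * params * reply_tokens  # ~2 FLOPs per parameter per token

    print(f"training run: {train_flops:.1e} FLOPs")  # ~4.2e+24
    print(f"one reply:    {reply_flops:.1e} FLOPs")  # ~7.0e+13
    print(f"ratio: {train_flops / reply_flops:.0e}") # ~6e+10

On these illustrative numbers, a single training run costs as much compute as tens of billions of chatbot replies, yet millions of queries per day still add up to a substantial share of total compute.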
"Open source" commonly refers to software that is made available with its source code
accessible to anyone, allowing anyone to inspect, modify and distribute the software. There is
not yet a common terminology as to what "open source" means in the context of AI models:
companies might release just the source code for training their AI model, or include the weights,
or even provide the training data, or they may have restrictions attached to their release. Open-
sourcing most typically, though imprecisely, refers to publication of model weights.
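As a practical illustration, publishing the weights means anyone can download and run the model. The sketch below uses the widely used Hugging Face transformers library; the model identifier is a placeholder, not a specific release.

    # Minimal sketch of running an openly released model with the
    # Hugging Face "transformers" library (model name is a placeholder).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "some-org/some-open-model"                   # placeholder identifier
    tokenizer = AutoTokenizer.from_pretrained(name)     # fetches tokenizer files
    model = AutoModelForCausalLM.from_pretrained(name)  # fetches the weights

    inputs = tokenizer("What is AI governance?", return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))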