DeepMind Technologies Limited

United Kingdom

1-100 of 788 for DeepMind Technologies Limited

Aggregations
IP Type
Patent	732
Trademark	56

Jurisdiction
United States	432
World	292
Canada	38
Europe	26

Date
New (last 4 weeks)	37
2024 April (MTD)	27
2024 March	20
2024 February	13
2024 January	7
2023 December	12
2024 (YTD)	69
2023	138
2022	116
2021	114
2020	117
2019	84
Before 2019	150
IPC Class
G06N 3/08 - Learning methods	434
G06N 3/04 - Architecture, e.g. interconnection topology	386
G06N 3/00 - Computing arrangements based on biological models	98
G06N 3/045 - Combinations of networks	83
G06K 9/62 - Methods or arrangements for recognition using electronic means	61
G06N 20/00 - Machine learning	46
G06N 3/084 - Backpropagation, e.g. using gradient descent	44
G06N 3/092 - Reinforcement learning	44
G06N 3/044 - Recurrent networks, e.g. Hopfield networks	29
G06N 7/00 - Computing arrangements based on specific mathematical models	29
G06N 3/02 - Neural networks	28
G16B 15/20 - Protein or domain folding	26
G16B 40/20 - Supervised data analysis	26
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks	23
G06F 17/18 - Complex mathematical operations for evaluating statistical data	22
G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]	22
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks	20
G06K 9/00 - Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints	19
G06N 3/047 - Probabilistic or stochastic networks	18
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means	18
G06F 17/16 - Matrix or vector computation	16
G06K 9/46 - Extraction of features or characteristics of the image	15
G06N 3/0464 - Convolutional networks [CNN, ConvNet]	14
B25J 9/16 - Programme controls	13
G06F 16/901 - Indexing; Data structures therefor; Storage structures	12
G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks	11
G06N 3/088 - Non-supervised learning, e.g. competitive learning	11
G16B 15/30 - Drug targeting using structural data; Docking or binding prediction	11
G06N 5/04 - Inference or reasoning models	10
G06N 5/00 - Computing arrangements using knowledge-based models	9
NICE Class
42 - Scientific, technological and industrial services, research and design	54
41 - Education, entertainment, sporting and cultural services	49
09 - Scientific and electric apparatus and instruments	47
16 - Paper, cardboard and goods made from these materials	34
28 - Games; toys; sports equipment	34
35 - Advertising and business services	19
10 - Medical apparatus and instruments	12
44 - Medical, veterinary, hygienic and cosmetic services; agriculture, horticulture and forestry services	10
07 - Machines and machine tools	7
12 - Land, air and water vehicles; parts of land vehicles	7
38 - Telecommunications services	6
05 - Pharmaceutical, veterinary and sanitary products	3
36 - Financial, insurance and real estate services	1
Status
Pending	215
Registered / In Force	573


1. LEVERAGING OFFLINE TRAINING DATA AND AGENT COMPETENCY MEASURES TO IMPROVE ONLINE LEARNING

Application Number	18492415
Status	Pending
Filing Date	2023-10-22
First Publication Date	2024-04-25
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Wen, Zheng Van Roy, Benjamin Jain, Rahul Anant Hao, Botao

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a target action selection policy to control a target agent interacting with an environment. In one aspect, a method comprises: obtaining a set of offline training data, wherein the offline training data characterizes interaction of a baseline agent with an environment as the baseline agent performs actions selected in accordance with a baseline action selection policy; generating a set of online training data that characterizes interaction of the target agent with the environment as the target agent performs actions selected in accordance with the target action selection policy; and training the target action selection policy on both: (i) the offline training data, and (ii) the online training data, wherein the training of the target action selection policy on the offline training data is conditioned on a measure of competency of the baseline agent.

IPC Classes

G06N 3/092 - Reinforcement learning

2. SELECTIVE ACQUISITION FOR MULTI-MODAL TEMPORAL DATA

Application Number	EP2023079389
Publication Number	2024/084097
Status	In Force
Filing Date	2023-10-21
Publication Date	2024-04-25
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Kossen, Jannik Lukas Belgrave, Danielle Charlotte Mary Tomasev, Nenad Cangea, Catalina-Codruta Ktena, Sofia Ira Vértes, Eszter Patraucean, Viorica Jaegle, Andrew Coulter

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a prediction characterizing an environment. In one aspect, a method includes obtaining a respective observation characterizing a state of an environment for each time step in a sequence of multiple time steps, comprising, for each time step after a first time step in the sequence of time steps: processing a network input that comprises observations obtained for one or more preceding time steps to generate a plurality of acquisition decisions; obtaining an observation for the time step, wherein the observation includes data corresponding to modalities that are selected for acquisition at the time step, and does not include data corresponding to modalities that are not selected for acquisition at the time step; and processing a model input that includes the observation for each time step in the sequence of time steps to generate the prediction.

IPC Classes

G06N 3/045 - Combinations of networks
G06N 3/092 - Reinforcement learning
G06N 5/022 - Knowledge engineering; Knowledge acquisition

3. GENERATING AUDIO USING NEURAL NETWORKS

Application Number	18519986
Status	Pending
Filing Date	2023-11-27
First Publication Date	2024-04-25
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Van Den Oord, Aaron Gerard Antonius Dieleman, Sander Etienne Lea Kalchbrenner, Nal Emmerich Simonyan, Karen Vinyals, Oriol

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of audio data that comprises a respective audio sample at each of a plurality of time steps. One of the methods includes, for each of the time steps: providing a current sequence of audio data as input to a convolutional subnetwork, wherein the current sequence comprises the respective audio sample at each time step that precedes the time step in the output sequence, and wherein the convolutional subnetwork is configured to process the current sequence of audio data to generate an alternative representation for the time step; and providing the alternative representation for the time step as input to an output layer, wherein the output layer is configured to: process the alternative representation to generate an output that defines a score distribution over a plurality of possible audio samples for the time step.

IPC Classes

G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G06N 3/04 - Architecture, e.g. interconnection topology
G06N 3/045 - Combinations of networks
G06N 3/048 - Activation functions
G10L 13/06 - Elementary speech units used in speech synthesisers; Concatenation rules
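
The per-time-step loop described in the abstract of entry 3 can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: `toy_score_fn` is a hypothetical stand-in for the convolutional subnetwork and output layer, `NUM_LEVELS` is an invented number of possible audio sample values, and drawing each sample at random from the score distribution is an assumption, since the abstract only says that the output defines a distribution.

```python
import random

NUM_LEVELS = 4  # hypothetical number of possible (quantized) audio sample values


def toy_score_fn(current_sequence):
    """Hypothetical stand-in for the network.

    Puts all score mass on (previous sample + 1) mod NUM_LEVELS,
    or on level 0 when the sequence is still empty.
    """
    nxt = 0 if not current_sequence else (current_sequence[-1] + 1) % NUM_LEVELS
    return [1.0 if level == nxt else 0.0 for level in range(NUM_LEVELS)]


def generate_audio(score_fn, num_steps, rng):
    """Generate one audio sample per time step, autoregressively."""
    samples = []
    for _ in range(num_steps):
        # The current sequence holds every sample that precedes this time step.
        scores = score_fn(samples)
        # Assumption: pick the sample by drawing from the score distribution.
        sample = rng.choices(range(len(scores)), weights=scores, k=1)[0]
        samples.append(sample)
    return samples
```

Because the toy scores are one-hot, the loop is deterministic here; a trained network would normally give a spread-out distribution at each step.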

4. DISTRIBUTIONAL REINFORCEMENT LEARNING USING QUANTILE FUNCTION NEURAL NETWORKS

Application Number	18542476
Status	Pending
Filing Date	2023-12-15
First Publication Date	2024-04-25
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Ostrovski, Georg Dabney, William Clinton

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting an action to be performed by a reinforcement learning agent interacting with an environment. In one aspect, a method comprises: receiving a current observation; for each action of a plurality of actions: randomly sampling one or more probability values; for each probability value: processing the action, the current observation, and the probability value using a quantile function network to generate an estimated quantile value for the probability value with respect to a probability distribution over possible returns that would result from the agent performing the action in response to the current observation; determining a measure of central tendency of the one or more estimated quantile values; and selecting an action to be performed by the agent in response to the current observation using the measures of central tendency for the actions.

IPC Classes

G06N 3/08 - Learning methods
G06N 3/04 - Architecture, e.g. interconnection topology
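
The action selection procedure in the abstract of entry 4 can be illustrated with a short sketch. Everything named here is an assumption made for illustration: `toy_quantile_fn` stands in for the quantile function network, the mean is used as the measure of central tendency, and the action with the largest mean is chosen. The abstract itself does not fix any of these choices.

```python
import random

# Hypothetical per-action offsets used only by the toy quantile function below.
TOY_BASE_RETURN = {"left": 0.0, "right": 10.0}


def toy_quantile_fn(action, observation, tau):
    """Hypothetical stand-in for the quantile function network.

    Returns the tau-quantile of a return distribution that is uniform on
    [base - 0.5, base + 0.5) for the given action.
    """
    return TOY_BASE_RETURN[action] + (tau - 0.5)


def select_action(observation, actions, quantile_fn, num_samples=8, rng=None):
    """Pick the action whose sampled quantile values have the largest mean."""
    rng = rng or random.Random()
    best_action, best_value = None, float("-inf")
    for action in actions:
        # Randomly sample probability values in [0, 1).
        taus = [rng.random() for _ in range(num_samples)]
        # One estimated quantile value per sampled probability value.
        quantiles = [quantile_fn(action, observation, tau) for tau in taus]
        # Assumption: the mean serves as the measure of central tendency.
        central = sum(quantiles) / len(quantiles)
        if central > best_value:
            best_action, best_value = action, central
    return best_action
```

With the toy function, `select_action(None, ["left", "right"], toy_quantile_fn)` always returns `"right"`, because the two return distributions do not overlap.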

5. FAST EXPLORATION AND LEARNING OF LATENT GRAPH MODELS

Application Number	18373870
Status	Pending
Filing Date	2023-09-27
First Publication Date	2024-04-18
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Swaminathan, Sivaramakrishnan Dave, Meet Kirankumar Lazaro-Gredilla, Miguel George, Dileep

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a graph model representing an environment being interacted with by an agent. In one aspect, one of the methods includes: obtaining experience data; using the experience data to update a visitation count for each of one or more state-action pairs represented by the graph model; and at each of multiple environment exploration steps: computing a utility measure for each of the one or more state-action pairs represented by the graph model; determining, based on the utility measures, a sequence of one or more planned actions that have an information gain that satisfies a threshold; and controlling the agent to perform the sequence of one or more planned actions to cause the environment to transition from a state characterized by a last observation received after a last action in the experience data into a different state.

IPC Classes

G06F 16/901 - Indexing; Data structures therefor; Storage structures
G06F 17/12 - Simultaneous equations

6. GENERATING A MODEL OF A TARGET ENVIRONMENT BASED ON INTERACTIONS OF AN AGENT WITH SOURCE ENVIRONMENTS

Application Number	18379988
Status	Pending
Filing Date	2023-10-13
First Publication Date	2024-04-18
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Bellot, Alexis Malek, Alan John Chiappa, Silvia

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for selecting actions for an agent in a target environment. In particular, the actions are selected using an environment model for the target environment that is parameterized using interactions of the agent with the target environment and one or more source environments.

IPC Classes

G06F 30/20 - Design optimisation, verification or simulation
G06F 16/901 - Indexing; Data structures therefor; Storage structures

7. OPTIMIZING ALGORITHMS FOR HARDWARE DEVICES

Application Number	17959210
Status	Pending
Filing Date	2022-10-03
First Publication Date	2024-04-18
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Hubert, Thomas Keisuke Huang, Shih-Chieh Novikov, Alexander Fawzi, Alhussein Romera-Paredes, Bernardino Silver, David Hassabis, Demis Swirszcz, Grzegorz Michal Schrittwieser, Julian Kohli, Pushmeet Barekatain, Mohammadamin Balog, Matej Rodriguez Ruiz, Francisco Jesus

Abstract

A method performed by one or more computers for obtaining an optimized algorithm that (i) is functionally equivalent to a target algorithm and (ii) optimizes one or more target properties when executed on a target set of one or more hardware devices. The method includes: initializing a target tensor representing the target algorithm; generating, using a neural network having a plurality of network parameters, a tensor decomposition of the target tensor that parametrizes a candidate algorithm; generating target property values for each of the target properties when executing the candidate algorithm on the target set of hardware devices; determining a benchmarking score for the tensor decomposition based on the target property values of the candidate algorithm; generating a training example from the tensor decomposition and the benchmarking score; and storing, in a training data store, the training example for use in updating the network parameters of the neural network.

IPC Classes

G06N 3/08 - Learning methods
G06N 3/063 - Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

8. NEURAL NETWORKS WITH ADAPTIVE GRADIENT CLIPPING

Application Number	18275087
Status	Pending
Filing Date	2022-02-02
First Publication Date	2024-04-18
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Brock, Andrew De, Soham Smith, Samuel Laurence Simonyan, Karen

Abstract

There is disclosed a computer-implemented method for training a neural network. The method comprises determining a gradient associated with a parameter of the neural network. The method further comprises determining a ratio of a gradient norm to parameter norm and comparing the ratio to a threshold. In response to determining that the ratio exceeds the threshold, the value of the gradient is reduced such that the ratio is equal to or below the threshold. The value of the parameter is updated based upon the reduced gradient value.

IPC Classes

G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/776 - Validation; Performance evaluation
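
The clipping rule in the abstract of entry 8 can be written out in a few lines. This is a sketch under assumptions rather than the claimed method: parameters and gradients are plain lists, norms are Euclidean, the update is ordinary gradient descent, and the small `eps` floor on the parameter norm is an added safeguard that the abstract does not mention.

```python
import math


def clip_gradient(grad, param, threshold, eps=1e-3):
    """Rescale grad so that ||grad|| / ||param|| does not exceed threshold."""
    grad_norm = math.sqrt(sum(g * g for g in grad))
    # Assumption: floor the parameter norm so zero-valued parameters can still move.
    param_norm = max(math.sqrt(sum(p * p for p in param)), eps)
    ratio = grad_norm / param_norm
    if ratio > threshold:
        # Shrink the gradient so the ratio equals the threshold exactly.
        scale = threshold / ratio
        return [g * scale for g in grad]
    return list(grad)


def sgd_step(param, grad, lr, threshold):
    """Update the parameter using the (possibly reduced) gradient."""
    clipped = clip_gradient(grad, param, threshold)
    return [p - lr * g for p, g in zip(param, clipped)]
```

For example, with `param = [3.0, 4.0]` (norm 5) and `grad = [30.0, 40.0]` (norm 50), the ratio is 10. A threshold of 0.1 scales the gradient by 0.01, giving approximately `[0.3, 0.4]`.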

9. META-LEARNED EVOLUTIONARY STRATEGIES OPTIMIZER

Application Number	18475859
Status	Pending
Filing Date	2023-09-27
First Publication Date	2024-04-18
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Lange, Robert Tjarko Schaul, Tom Chen, Yutian Zahavy, Tom Ben Zion Dalibard, Valentin Clement Lu, Christopher Yenchuan Baveja, Satinder Singh Flennerhag, Johan Sebastian

Abstract

There is provided a computer-implemented method for updating a search distribution of an evolutionary strategies optimizer using an optimizer neural network comprising one or more attention blocks. The method comprises receiving a plurality of candidate solutions, one or more parameters defining the search distribution that the plurality of candidate solutions are sampled from, and fitness score data indicating a fitness of each respective candidate solution of the plurality of candidate solutions. The method further comprises processing, by the one or more attention neural network blocks, the fitness score data using an attention mechanism to generate respective recombination weights corresponding to each respective candidate solution. The method further comprises updating the one or more parameters defining the search distribution based upon the recombination weights applied to the plurality of candidate solutions.

IPC Classes

G06N 3/086 - Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
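
The update step in the abstract of entry 9 (fitness scores in, recombination weights out, search distribution parameters updated) can be sketched as follows. Note the substitution: the learned attention blocks are replaced here by a fixed softmax over standardized fitness scores, which is only a simple stand-in. The sketch also assumes that higher fitness is better and that the only search distribution parameter being updated is the mean.

```python
import math


def recombination_weights(fitness, temperature=1.0):
    """Map fitness scores to weights that are positive and sum to 1.

    Stand-in for the optimizer network: softmax over z-scored fitness.
    """
    mean = sum(fitness) / len(fitness)
    var = sum((f - mean) ** 2 for f in fitness) / len(fitness)
    std = math.sqrt(var) or 1.0  # avoid division by zero when all scores are equal
    scores = [(f - mean) / (std * temperature) for f in fitness]
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def update_mean(candidates, weights):
    """New search-distribution mean: weighted combination of the candidates."""
    dim = len(candidates[0])
    return [sum(w * c[d] for w, c in zip(weights, candidates)) for d in range(dim)]
```

A usage sketch: with candidates `[[0, 0], [1, 1], [2, 2]]` and fitness `[0, 1, 2]`, the weights increase with fitness, so the updated mean moves from the candidates' average toward `[2, 2]`.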

10. DISTRIBUTED TRAINING USING ACTOR-CRITIC REINFORCEMENT LEARNING WITH OFF-POLICY CORRECTION FACTORS

Application Number	18487428
Status	Pending
Filing Date	2023-10-16
First Publication Date	2024-04-18
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Soyer, Hubert Josef Espeholt, Lasse Simonyan, Karen Doron, Yotam Firoiu, Vlad Mnih, Volodymyr Kavukcuoglu, Koray Munos, Remi Ward, Thomas Harley, Timothy James Alexander Dunning, Iain Robert

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network used to select actions to be performed by an agent interacting with an environment. In one aspect, a system comprises a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate experience tuple trajectories that are used by the learner computing units to update learner action selection neural network parameters using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor-critic reinforcement learning technique.

IPC Classes

G06N 3/08 - Learning methods
G06N 3/045 - Combinations of networks

11. PATHOGENICITY PREDICTION FOR PROTEIN MUTATIONS USING AMINO ACID SCORE DISTRIBUTIONS

Application Number	EP2023078227
Publication Number	2024/079204
Status	In Force
Filing Date	2023-10-11
Publication Date	2024-04-18
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Avsec, Ziga Novati, Guido Cheng, Jun

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a pathogenicity score characterizing a likelihood that a mutation to a protein is a pathogenic mutation, wherein the mutation modifies an amino acid sequence of the protein by replacing an original amino acid by a substitute amino acid at a mutation position in the amino acid sequence of the protein. In one aspect, a method comprises: generating a network input to a pathogenicity prediction neural network, wherein the network input comprises a multiple sequence alignment (MSA) representation that represents an MSA for the protein; processing the network input using the pathogenicity prediction neural network to generate a score distribution over a set of amino acids; and generating the pathogenicity score using the score distribution over the set of amino acids.

IPC Classes

G16B 15/20 - Protein or domain folding
G16B 20/50 - Mutagenesis
G16B 30/10 - Sequence alignment; Homology search
G16B 40/20 - Supervised data analysis

12. PREDICTING PROTEIN AMINO ACID SEQUENCES USING GENERATIVE MODELS CONDITIONED ON PROTEIN STRUCTURE EMBEDDINGS

Application Number	18275933
Status	Pending
Filing Date	2022-01-27
First Publication Date	2024-04-11
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Senior, Andrew W. Kohl, Simon Yim, Jason Bates, Russell James Ionescu, Catalin-Dumitru Nash, Charlie Thomas Curtis Razavi-Nematollahi, Ali Pritzel, Alexander Jumper, John

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing protein design. In one aspect, a method comprises: processing an input characterizing a target protein structure of a target protein using an embedding neural network having a plurality of embedding neural network parameters to generate an embedding of the target protein structure of the target protein; determining a predicted amino acid sequence of the target protein based on the embedding of the target protein structure, comprising: conditioning a generative neural network having a plurality of generative neural network parameters on the embedding of the target protein structure; and generating, by the generative neural network conditioned on the embedding of the target protein structure, a representation of the predicted amino acid sequence of the target protein.

IPC Classes

G16B 15/00 - ICT specially adapted for analysing two-dimensional or three-dimensional molecular structures, e.g. structural or functional relations or structure alignment
G16B 30/20 - Sequence assembly
G16B 40/20 - Supervised data analysis

13. DISCRETE TOKEN PROCESSING USING DIFFUSION MODELS

Application Number	18374447
Status	Pending
Filing Date	2023-09-28
First Publication Date	2024-04-11
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Strudel, Robin Leblond, Rémi Sifre, Laurent Dieleman, Sander Etienne Lea Savinov, Nikolay Grathwohl, Will S. Tallec, Corentin Altché, Florent Ganin, Iaroslav Mensch, Arthur Du, Yilin

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating an output sequence of discrete tokens using a diffusion model. In one aspect, a method includes generating, by using the diffusion model, a final latent representation of the sequence of discrete tokens that includes a determined value for each of a plurality of latent variables; applying a de-embedding matrix to the final latent representation of the output sequence of discrete tokens to generate a de-embedded final latent representation that includes, for each of the plurality of latent variables, a respective numeric score for each discrete token in a vocabulary of multiple discrete tokens; selecting, for each of the plurality of latent variables, a discrete token from among the multiple discrete tokens in the vocabulary that has a highest numeric score; and generating the output sequence of discrete tokens that includes the selected discrete tokens.

IPC Classes

G06N 3/045 - Combinations of networks
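
The last steps in the abstract of entry 13 (apply a de-embedding matrix, then take the highest-scoring token for each latent variable) can be sketched as below. The diffusion model itself is omitted: `latents` is assumed to be its final output, and the row-per-token layout of `de_embedding` is an illustrative choice, not something the abstract specifies.

```python
def de_embed(latents, de_embedding, vocab):
    """Turn final latent vectors into discrete tokens.

    latents:      one vector per latent variable (list of lists of floats).
    de_embedding: one row per vocabulary token, same width as a latent vector.
    vocab:        the discrete tokens, in the same order as the rows.
    """
    tokens = []
    for latent in latents:
        # Numeric score for every token: dot product of the latent with its row.
        scores = [sum(x * w for x, w in zip(latent, row)) for row in de_embedding]
        # Select the token with the highest score.
        best = max(range(len(vocab)), key=scores.__getitem__)
        tokens.append(vocab[best])
    return tokens
```

For instance, with `vocab = ["a", "b", "c"]`, `de_embedding = [[1, 0], [0, 1], [-1, -1]]`, and `latents = [[0.9, 0.1], [0.2, 0.8], [-1.0, -0.5]]`, the selected tokens are `["a", "b", "c"]`.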

14. PROGRESSIVE NEURAL NETWORKS

Application Number	18479775
Status	Pending
Filing Date	2023-10-02
First Publication Date	2024-04-11
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Rabinowitz, Neil Charles Desjardins, Guillaume Rusu, Andrei-Alexandru Kavukcuoglu, Koray Hadsell, Raia Thais Pascanu, Razvan Kirkpatrick, James Soyer, Hubert Josef

Abstract

Methods and systems for performing a sequence of machine learning tasks. One system includes a sequence of deep neural networks (DNNs), including: a first DNN corresponding to a first machine learning task, wherein the first DNN comprises a first plurality of indexed layers, and each layer in the first plurality of indexed layers is configured to receive a respective layer input and process the layer input to generate a respective layer output; and one or more subsequent DNNs corresponding to one or more respective machine learning tasks, wherein each subsequent DNN comprises a respective plurality of indexed layers, and each layer in a respective plurality of indexed layers with index greater than one receives input from a preceding layer of the respective subsequent DNN, and one or more preceding layers of respective preceding DNNs, wherein a preceding layer is a layer whose index is one less than the current index.

IPC Classes

G06N 3/045 - Combinations of networks
G06F 17/16 - Matrix or vector computation
G06N 3/08 - Learning methods

15. EVALUATING REPRESENTATIONS WITH READ-OUT MODEL SWITCHING

Application Number	18475972
Status	Pending
Filing Date	2023-09-27
First Publication Date	2024-04-11
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Li, Yazhe Bornschein, Jorg Hutter, Marcus

Abstract

A method of automatically selecting a neural network from a plurality of computer-implemented candidate neural networks, each candidate neural network comprising at least an encoder neural network trained to encode an input value as a latent representation. The method comprises: obtaining a sequence of data items, each of the data items comprising an input value and a target value; and determining a respective score for each of the candidate neural networks, comprising evaluating the encoder neural network of the candidate neural network using a plurality of read-out heads. Each read-out head comprises parameters for predicting a target value from a latent representation of an input value of a data item encoded using the encoder neural network of the candidate neural network. The method further comprises selecting the neural network from the plurality of candidate neural networks using the respective scores.

IPC Classes

G06N 3/092 - Reinforcement learning

16. OPTIMIZING ALGORITHMS FOR HARDWARE DEVICES

Application Number	EP2023077237
Publication Number	2024/074452
Status	In Force
Filing Date	2023-10-02
Publication Date	2024-04-11
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Hubert, Thomas Keisuke Huang, Shih-Chieh Novikov, Alexander Fawzi, Alhussein Romera-Paredes, Bernardino Silver, David Hassabis, Demis Swirszcz, Grzegorz Michal Schrittwieser, Julian Kohli, Pushmeet Barekatain, Mohammadamin Balog, Matej Rodriguez Ruiz, Francisco Jesus

Abstract

IPC Classes

G06F 16/901 - Indexing; Data structures therefor; Storage structures
G06N 3/092 - Reinforcement learning
G06N 3/0985 - Hyperparameter optimisation; Meta-learning; Learning-to-learn
G06N 5/01 - Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

17. CONTROLLING AGENTS USING REPORTER NEURAL NETWORKS

Application Number	EP2023076516
Publication Number	2024/068610
Status	In Force
Filing Date	2023-09-26
Publication Date	2024-04-04
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Dasgupta, Ishita Chen, Shiqi Marino, Kenneth Daniel Shang, Wenling Ahuja, Arun

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents using reporter neural networks.

IPC Classes

G06F 16/9032 - Query formulation
G06F 16/903 - Querying
G06N 3/00 - Computing arrangements based on biological models

18. SCORE MODELLING FOR SIMULATION-BASED INFERENCE

Application Number	EP2023076529
Publication Number	2024/068622
Status	In Force
Filing Date	2023-09-26
Publication Date	2024-04-04
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Mnih, Andriy Geffner, Tomas

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for using simulation-based inference to infer a set of parameters, such as measurements, from observations, e.g., real-world observations. The method uses a score generation neural network to determine scores for individual observations or for groups of observations that are combined and used to iteratively adjust values of the parameters.

IPC Classes

G06F 16/906 - Clustering; Classification
G06N 3/00 - Computing arrangements based on biological models

19. GRAPH NEURAL NETWORKS THAT MODEL FACE-FACE INTERACTIONS BETWEEN MESHES

Application Number	EP2023076797
Publication Number	2024/068788
Status	In Force
Filing Date	2023-09-27
Publication Date	2024-04-04
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Allen, Kelsey Rebecca Rubanova, Yulia Lopez Guevara, Tatiana Whitney, William Fairclough Sanchez, Alvaro Battaglia, Peter William Pfaff, Tobias

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for simulating a state of an environment over a sequence of time steps. In one aspect, a method comprises, at each of one or more time steps: obtaining an environment mesh representing the state of the environment at the time step; generating a graph representing the state of the environment at the time step, comprising: determining that a first face of a first object mesh is within a collision distance of a second face of a second object mesh; and in response, instantiating a face-face edge in the graph that connects: (i) a first set of graph nodes in the graph that represent the first face in the first object mesh, and (ii) a second set of graph nodes in the graph that represent the second face in the second object mesh.

IPC Classes

G06F 30/20 - Design optimisation, verification or simulation
B25J 9/00 - Programme-controlled manipulators

20. LEARNING TASKS USING SKILL SEQUENCING FOR TEMPORALLY-EXTENDED EXPLORATION

Application Number	EP2023076798
Publication Number	2024/068789
Status	In Force
Filing Date	2023-09-27
Publication Date	2024-04-04
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Vezzani, Giulia Tirumala Bukkapatnam, Dhruva Wulfmeier, Markus Riedmiller, Martin Heess, Nicolas Manfred Otto

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling an agent that is interacting with an environment. Implementations of the system use previously learned skills to explore states of the environment to collect and store training data, which is then used to train an action selection system. The system includes a set of skill action selection subsystems, each configured to select actions for the agent to perform for a respective skill. The set of skill action selection subsystems is used to explore states of the environment to collect the training data, keeping their individual action selection policies unchanged. A scheduler neural network selects the skill neural networks to use. The action selection system is trained on the stored training data.

IPC Classes

G06N 3/092 - Reinforcement learning
G06N 3/045 - Combinations of networks
G06N 3/084 - Backpropagation, e.g. using gradient descent

21. REINFORCEMENT LEARNING USING DENSITY ESTIMATION WITH ONLINE CLUSTERING FOR EXPLORATION

Application Number	EP2023076893
Publication Number	2024/068841
Status	In Force
Filing Date	2023-09-28
Publication Date	2024-04-04
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Saade, Alaa Kapturowski, Steven James Calandriello, Daniele Blundell, Charles Valko, Michal Sprechmann, Pablo Piot, Bilal

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network used to select actions to be performed by an agent interacting with an environment. Implementations of the described techniques can learn to explore the environment efficiently by storing and updating state embedding cluster centers based on observations characterizing states of the environment.

IPC Classes

G06F 16/901 - Indexing; Data structures therefor; Storage structures
G06F 16/906 - Clustering; Classification
G06N 3/092 - Reinforcement learning

22. AGENT CONTROL THROUGH IN-CONTEXT REINFORCEMENT LEARNING

Application Number	EP2023076897
Publication Number	2024/068843
Status	In Force
Filing Date	2023-09-28
Publication Date	2024-04-04
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Laskin, Michael Mnih, Volodymyr Wang, Luyu Baveja, Satinder Singh

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents. In particular, an agent can be controlled using an action selection neural network that performs in-context reinforcement learning when controlling an agent on a new task.

IPC Classes

G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G06N 3/0442 - Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
G06N 3/092 - Reinforcement learning

23. CONTROLLING AGENTS USING REPORTER NEURAL NETWORKS

Application Number	18475157
Status	Pending
Filing Date	2023-09-26
First Publication Date	2024-04-04
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Dasgupta, Ishita Chen, Shiqi Marino, Kenneth Daniel Shang, Wenling Ahuja, Arun

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents using reporter neural networks.

IPC Classes

G06N 3/091 - Active learning
G06F 40/35 - Discourse or dialogue representation

24. DISCRETE TOKEN PROCESSING USING DIFFUSION MODELS

Application Number	EP2023076788
Publication Number	2024/068781
Status	In Force
Filing Date	2023-09-27
Publication Date	2024-04-04
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Strudel, Robin Leblond, Rémi Sifre, Laurent Dieleman, Sander Etienne Lea Savinov, Nikolay Grathwohl, Will S. Tallec, Corentin Altché, Florent Ganin, Iaroslav Mensch, Arthur Du, Yilun

Abstract

IPC Classes ?

G06N 3/0475 - Generative networks
G06N 3/088 - Non-supervised learning, e.g. competitive learning

25. REWARD-MODEL BASED REINFORCEMENT LEARNING FOR PERFORMING REASONING TASKS

Application Number	EP2023076792
Publication Number	2024/068784
Status	In Force
Filing Date	2023-09-27
Publication Date	2024-04-04
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Higgins, Irina Uesato, Jonathan Ken Kushman, Nathaniel Arthur Kumar, Ramana

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a language model for performing a reasoning task. The system obtains a plurality of training examples. Each training example includes a respective sample query text sequence characterizing a respective sample query and a respective reference response text sequence that includes a reference final answer to the respective sample query. The system trains a reward model on the plurality of training examples. The reward model is configured to receive an input including a query text sequence characterizing a query and one or more reasoning steps that have been generated in response to the query and process the input to compute a reward score indicating how successful the one or more reasoning steps are in yielding a correct final answer to the query. The system trains the language model using the trained reward model.

IPC Classes ?

G06F 16/332 - Query formulation
G06F 16/33 - Querying
G06F 16/338 - Presentation of query results
G06N 3/02 - Neural networks
G06N 3/092 - Reinforcement learning
G06N 5/04 - Inference or reasoning models

26. SYSTEM AND METHOD FOR REINFORCEMENT LEARNING BASED ON PRIOR TRAJECTORIES

Application Number	EP2023076793
Publication Number	2024/068785
Status	In Force
Filing Date	2023-09-27
Publication Date	2024-04-04
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Bruce, Jacob Anand, Ankit Fergus, Robert David

Abstract

A reinforcement learning system is proposed in which a policy model neural network is trained to control an agent to perform a task in successive time steps, by training a control system including the policy model neural network to select a respective action for each time step which gives a high value for a reward function based on the action, and which indicates the contribution of the action to solving the task. The reward function includes a term based on a progress value output by a progress model. The progress model generates the progress value upon receiving a first observation of the state of the environment at a time step before the performance of the action, and a second observation of the state of the environment at a time step following the performance of the action. The progress value is an estimate of the average time which an ensemble of experts who produced the demonstrations would have taken to transform the environment from how it appears in the first observation to how it appears in the second observation.
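
The reward described above combines a task reward with a progress term computed from the observations before and after an action. The sketch below is illustrative only: the additive form, the `progress_weight` parameter, and `toy_progress_model` (a stand-in for the learned progress model) are assumptions, not details taken from the application.

```python
from typing import Callable, Sequence

Observation = Sequence[float]


def shaped_reward(
    task_reward: float,
    obs_before: Observation,
    obs_after: Observation,
    progress_model: Callable[[Observation, Observation], float],
    progress_weight: float = 1.0,
) -> float:
    """Add a weighted progress term to the task reward (assumed additive form)."""
    progress = progress_model(obs_before, obs_after)
    return task_reward + progress_weight * progress


def toy_progress_model(obs_before: Observation, obs_after: Observation) -> float:
    """Hypothetical stand-in for a learned progress model.

    Pretends the first observation entry counts expert steps completed, so the
    difference plays the role of the estimated expert time between the two states.
    """
    return obs_after[0] - obs_before[0]


print(shaped_reward(0.0, [3.0], [5.0], toy_progress_model, progress_weight=0.5))
```

In a real system the progress model would be a neural network trained on expert demonstrations; only the way its output enters the reward is shown here.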

IPC Classes ?

G06N 3/02 - Neural networks
G06N 3/08 - Learning methods
G06N 3/092 - Reinforcement learning

27. NEURAL NETWORKS WITH REGULARIZED ATTENTION LAYERS

Application Number	EP2023076794
Publication Number	2024/068786
Status	In Force
Filing Date	2023-09-27
Publication Date	2024-04-04
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	He, Bobby Boyi Martens, James

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing a network input using a neural network that includes one or more regularized attention layers. In one aspect, a method comprises: receiving a layer input to a regularized attention layer, wherein the layer input to the regularized attention layer comprises a set of input embeddings; and applying a regularized attention operation over the set of input embeddings to generate a set of output embeddings, comprising: transforming intermediate attention scores using a set of shaping constants to generate a set of transformed attention scores, wherein: values of the shaping constants are initialized prior to training of the neural network and are not adjusted during the training of the neural network; and the values of the shaping constants are selected to regularize the set of output embeddings.
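
As a rough illustration of transforming attention scores with fixed shaping constants, the sketch below mixes softmax scores with an identity term. The affine form `ALPHA * I + BETA * scores` and the constant values are assumptions chosen for illustration, and the embeddings are used directly as queries, keys, and values with no learned projections.

```python
import math
from typing import List

Matrix = List[List[float]]

# Shaping constants: set before training and never updated (illustrative values).
ALPHA = 1.0  # weight on the identity term
BETA = 0.1   # weight on the softmax attention scores


def softmax(row: List[float]) -> List[float]:
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def regularized_attention(embeddings: Matrix) -> Matrix:
    """Self-attention whose scores are reshaped by the fixed constants above."""
    n, d = len(embeddings), len(embeddings[0])
    outputs = []
    for i in range(n):
        logits = [
            sum(a * b for a, b in zip(embeddings[i], embeddings[j])) / math.sqrt(d)
            for j in range(n)
        ]
        scores = softmax(logits)  # intermediate attention scores
        # Transformed attention scores: scaled softmax plus an identity term.
        shaped = [BETA * s + (ALPHA if j == i else 0.0) for j, s in enumerate(scores)]
        outputs.append(
            [sum(shaped[j] * embeddings[j][k] for j in range(n)) for k in range(d)]
        )
    return outputs


print(regularized_attention([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]))
```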

IPC Classes ?

G06N 3/045 - Combinations of networks
G06N 3/08 - Learning methods

28. REWARD-MODEL BASED REINFORCEMENT LEARNING FOR PERFORMING REASONING TASKS

Application Number	18475743
Status	Pending
Filing Date	2023-09-27
First Publication Date	2024-03-28
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Higgins, Irina Uesato, Jonathan Ken Kushman, Nathaniel Arthur Kumar, Ramana

Abstract

IPC Classes ?

G06N 3/092 - Reinforcement learning

29. GUIDED DIALOGUE USING LANGUAGE GENERATION NEURAL NETWORKS AND SEARCH

Application Number	EP2023075931
Publication Number	2024/061963
Status	In Force
Filing Date	2023-09-20
Publication Date	2024-03-28
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Irving, Geoffrey Glaese, Amelia Marita Claudia Mcaleese-Park, Nathaniel John Hendricks, Lisa Anne Marie

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enabling a user to conduct a dialogue. Implementations of the system learn when to rely on supporting evidence, obtained from an external search system via a search system interface, and are also able to generate replies for the user that align with the preferences of a previously trained response selection neural network. Implementations of the system can also use a previously trained rule violation detection neural network to generate replies that take account of previously learnt rules.

IPC Classes ?

G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G06N 3/092 - Reinforcement learning
G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks
G06N 3/042 - Knowledge-based neural networks; Logical representations of neural networks
G06N 3/084 - Backpropagation, e.g. using gradient descent
G06N 3/094 - Adversarial learning
G06N 5/01 - Dynamic search techniques; Heuristics; Dynamic trees; Branch-and-bound

30. GENERATING NEURAL NETWORK OUTPUTS BY ENRICHING LATENT EMBEDDINGS USING SELF-ATTENTION AND CROSS-ATTENTION OPERATIONS

Application Number	18271611
Status	Pending
Filing Date	2022-02-03
First Publication Date	2024-03-28
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Jaegle, Andrew Coulter Carreira, Joao

Abstract

This specification describes a method for using a neural network to generate a network output that characterizes an entity. The method includes: obtaining a representation of the entity as a set of data element embeddings, obtaining a set of latent embeddings, and processing: (i) the set of data element embeddings, and (ii) the set of latent embeddings, using the neural network to generate the network output characterizing the entity. The neural network includes: (i) one or more cross-attention blocks, (ii) one or more self-attention blocks, and (iii) an output block. Each cross-attention block updates each latent embedding using attention over some or all of the data element embeddings. Each self-attention block updates each latent embedding using attention over the set of latent embeddings. The output block processes one or more latent embeddings to generate the network output that characterizes the entity.
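
The block structure in this abstract can be sketched as follows. This is a minimal illustration under simplifying assumptions: attention uses no learned projections, each block adds a residual connection, and the output block is a plain mean over the latents. None of these choices comes from the application.

```python
import math
from typing import List

Vector = List[float]


def attend(queries: List[Vector], keys_values: List[Vector]) -> List[Vector]:
    """Update each query with a residual softmax-attention read of keys_values."""
    dim = len(queries[0])
    updated = []
    for q in queries:
        logits = [
            sum(a * b for a, b in zip(q, kv)) / math.sqrt(dim) for kv in keys_values
        ]
        peak = max(logits)
        weights = [math.exp(l - peak) for l in logits]
        total = sum(weights)
        read = [
            sum(w * kv[k] for w, kv in zip(weights, keys_values)) / total
            for k in range(dim)
        ]
        updated.append([qk + rk for qk, rk in zip(q, read)])
    return updated


def encode(
    data_embeddings: List[Vector], latents: List[Vector], num_blocks: int = 2
) -> Vector:
    for _ in range(num_blocks):
        latents = attend(latents, data_embeddings)  # cross-attention block
        latents = attend(latents, latents)          # self-attention block
    # Output block (assumed): average the latent embeddings.
    dim = len(latents[0])
    return [sum(latent[k] for latent in latents) / len(latents) for k in range(dim)]


print(encode([[1.0, 0.0]] * 4, [[0.0, 0.0]] * 2, num_blocks=1))
```

Because only the cross-attention blocks touch the data element embeddings, the cost of the self-attention blocks depends on the number of latents rather than on the size of the input.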

IPC Classes ?

G06N 3/0475 - Generative networks
G06N 3/084 - Backpropagation, e.g. using gradient descent

31. SEQUENCE-TO-SEQUENCE NEURAL NETWORK SYSTEMS USING LOOK AHEAD TREE SEARCH

Application Number	18274748
Status	Pending
Filing Date	2022-02-08
First Publication Date	2024-03-28
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Leblond, Rémi Bertrand Francis Alayrac, Jean-Baptiste Sifre, Laurent Pîslar, Miruna Lespiau, Jean-Baptiste Antonoglou, Ioannis Simonyan, Karen Silver, David Vinyals, Oriol

Abstract

A computer-implemented method for generating an output token sequence from an input token sequence. The method combines a look ahead tree search, such as a Monte Carlo tree search, with a sequence-to-sequence neural network system. The sequence-to-sequence neural network system has a policy output defining a next token probability distribution, and may include a value neural network providing a value output to evaluate a sequence. An initial partial output sequence is extended using the look ahead tree search guided by the policy output and, in implementations, the value output, of the sequence-to-sequence neural network system until a complete output sequence is obtained.
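
To make the idea of extending a partial output with a look-ahead search concrete, the sketch below uses an exhaustive depth-limited look-ahead rather than a Monte Carlo tree search, which keeps the example short. The policy supplies candidate tokens and breaks ties, and the value function scores the sequences reached. `toy_policy`, `toy_value`, and `TARGET` are hypothetical stand-ins for the neural network outputs.

```python
from typing import Callable, Dict, Tuple

Tokens = Tuple[str, ...]
EOS = "<eos>"


def lookahead_decode(
    policy: Callable[[Tokens], Dict[str, float]],
    value: Callable[[Tokens], float],
    depth: int = 2,
    max_len: int = 10,
) -> Tokens:
    def best_value(seq: Tokens, remaining: int) -> float:
        # Stop expanding at the depth limit or once the sequence is complete.
        if remaining == 0 or (seq and seq[-1] == EOS):
            return value(seq)
        return max(
            best_value(seq + (tok,), remaining - 1)
            for tok, prob in policy(seq).items()
            if prob > 0.0
        )

    seq: Tokens = ()
    while len(seq) < max_len and (not seq or seq[-1] != EOS):
        candidates = policy(seq)
        next_token = max(
            candidates,
            key=lambda tok: (best_value(seq + (tok,), depth - 1), candidates[tok]),
        )
        seq = seq + (next_token,)
    return seq


TARGET: Tokens = ("b", "a", EOS)  # hypothetical reference used by the toy value


def toy_policy(seq: Tokens) -> Dict[str, float]:
    return {"a": 0.5, "b": 0.3, EOS: 0.2}


def toy_value(seq: Tokens) -> float:
    return float(sum(1 for s, t in zip(seq, TARGET) if s == t))


print(lookahead_decode(toy_policy, toy_value))
```

Greedy decoding from `toy_policy` alone would start with the token "a"; the look-ahead lets the value estimate override the locally most probable token.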

IPC Classes ?

G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks

32. GENERATING IMAGES USING SPARSE REPRESENTATIONS

Application Number	18275048
Status	Pending
Filing Date	2022-02-07
First Publication Date	2024-03-28
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Nash, Charlie Thomas Curtis Battaglia, Peter William

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating compressed representations of synthetic images. One of the methods is a method of generating a synthetic image using a generative neural network, and includes: generating, using the generative neural network, a plurality of coefficients that represent the synthetic image after the synthetic image has been encoded using a lossy compression algorithm; and decoding the synthetic image by applying the lossy compression algorithm to the plurality of coefficients.

IPC Classes ?

G06T 9/00 - Image coding
G06T 7/11 - Region-based segmentation
G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods

33. TEMPORAL DIFFERENCE SCALING WHEN CONTROLLING AGENTS USING REINFORCEMENT LEARNING

Application Number	18275145
Status	Pending
Filing Date	2022-02-04
First Publication Date	2024-03-28
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Schaul, Tom

Abstract

A reinforcement learning neural network system configured to manage rewards on scales that can vary significantly. The system determines the value of a scale factor that is applied to a temporal difference error used for reinforcement learning. The scale factor depends at least upon a variance of the rewards received during the reinforcement learning.
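
One way to picture a scale factor that depends on reward variance is sketched below. It is an assumption-laden illustration: the running variance uses Welford's algorithm, the scale factor is taken to be the inverse standard deviation, and `min_std` is a hypothetical floor that avoids division by very small numbers. The application may define the factor differently.

```python
import math


class RewardScaler:
    """Tracks a running variance of observed rewards (Welford's algorithm)."""

    def __init__(self, min_std: float = 1e-2) -> None:
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean
        self.min_std = min_std

    def observe(self, reward: float) -> None:
        self.count += 1
        delta = reward - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (reward - self.mean)

    def scale_factor(self) -> float:
        variance = self.m2 / self.count if self.count else 0.0
        return 1.0 / max(math.sqrt(variance), self.min_std)


def scaled_td_error(
    reward: float,
    value: float,
    next_value: float,
    scaler: RewardScaler,
    discount: float = 0.99,
) -> float:
    """One-step temporal difference error multiplied by the current scale factor."""
    scaler.observe(reward)
    td_error = reward + discount * next_value - value
    return scaler.scale_factor() * td_error


scaler = RewardScaler()
for r in (0.0, 10.0):
    scaler.observe(r)
print(scaler.scale_factor())
```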

IPC Classes ?

G06N 3/092 - Reinforcement learning

34. NEURAL NETWORK REINFORCEMENT LEARNING WITH DIVERSE POLICIES

Application Number	18275511
Status	Pending
Filing Date	2022-02-04
First Publication Date	2024-03-28
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Zahavy, Tom Ben Zion O'Donoghue, Brendan Timothy Da Motta Salles Barreto, Andre Flennerhag, Johan Sebastian Mnih, Volodymyr Baveja, Satinder Singh

Abstract

In one aspect there is provided a method for training a neural network system by reinforcement learning. The neural network system may be configured to receive an input observation characterizing a state of an environment interacted with by an agent and to select and output an action in accordance with a policy aiming to satisfy an objective. The method may comprise obtaining a policy set comprising one or more policies for satisfying the objective and determining a new policy based on the one or more policies. The determining may include one or more optimization steps that aim to maximize a diversity of the new policy relative to the policy set under the condition that the new policy satisfies a minimum performance criterion based on an expected return that would be obtained by following the new policy.
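
The constrained objective can be illustrated with a simple selection over candidate policies. This sketch replaces the optimization steps with a search over a finite candidate list, which is a deliberate simplification. The `Policy` fields, the distance-based `diversity` measure, and the `alpha` fraction defining the minimum performance criterion are all assumptions.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Policy:
    name: str
    expected_return: float
    behaviour: Tuple[float, ...]  # hypothetical summary of the states the policy visits


def diversity(candidate: Policy, policy_set: List[Policy]) -> float:
    """Distance from the candidate to its nearest neighbour in the policy set."""
    return min(math.dist(candidate.behaviour, p.behaviour) for p in policy_set)


def select_new_policy(
    candidates: List[Policy],
    policy_set: List[Policy],
    best_return: float,
    alpha: float = 0.9,
) -> Optional[Policy]:
    """Pick the most diverse candidate whose return is at least alpha * best_return."""
    feasible = [c for c in candidates if c.expected_return >= alpha * best_return]
    if not feasible:
        return None
    return max(feasible, key=lambda c: diversity(c, policy_set))


existing = [Policy("base", 10.0, (0.0, 0.0))]
options = [
    Policy("A", 9.5, (1.0, 0.0)),
    Policy("B", 9.2, (3.0, 4.0)),
    Policy("C", 5.0, (10.0, 10.0)),  # very different, but fails the return criterion
]
print(select_new_policy(options, existing, best_return=10.0))
```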

IPC Classes ?

G06N 3/092 - Reinforcement learning

35. GUIDED DIALOGUE USING LANGUAGE GENERATION NEURAL NETWORKS AND SEARCH

Application Number	18471257
Status	Pending
Filing Date	2023-09-20
First Publication Date	2024-03-28
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Irving, Geoffrey Glaese, Amelia Marita Claudia Mcaleese-Park, Nathaniel John Hendricks, Lisa Anne Marie

Abstract

IPC Classes ?

G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
G06F 40/35 - Discourse or dialogue representation
G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks
G06N 3/092 - Reinforcement learning

36. AGENT CONTROL THROUGH IN-CONTEXT REINFORCEMENT LEARNING

Application Number	18477492
Status	Pending
Filing Date	2023-09-28
First Publication Date	2024-03-28
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Laskin, Michael Mnih, Volodymyr Wang, Luyu Baveja, Satinder Singh

Abstract

IPC Classes ?

G06N 3/08 - Learning methods

37. Image processing of an environment to select an action to be performed by an agent interacting with the environment

Application Number	17737544
Grant Number	11941088
Status	In Force
Filing Date	2022-05-05
First Publication Date	2024-03-26
Grant Date	2024-03-26
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Mnih, Volodymyr Kavukcuoglu, Koray

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing images using recurrent attention. One of the methods includes determining a location in the first image; extracting a glimpse from the first image using the location; generating a glimpse representation of the extracted glimpse; processing the glimpse representation using a recurrent neural network to update a current internal state of the recurrent neural network to generate a new internal state; processing the new internal state to select a location in a next image in the image sequence after the first image; and processing the new internal state to select an action from a predetermined set of possible actions.
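
The glimpse extraction step can be sketched as a crop around the selected location. This covers only that one step; the glimpse representation, the recurrent update, and the location and action selection are omitted. The square patch shape and the zero padding outside the image are assumptions.

```python
from typing import List, Tuple

Image = List[List[float]]


def extract_glimpse(image: Image, center: Tuple[int, int], size: int) -> Image:
    """Crop a size x size patch centred on `center`, padding with zeros past the edges."""
    row0 = center[0] - size // 2
    col0 = center[1] - size // 2
    height, width = len(image), len(image[0])
    return [
        [
            image[r][c] if 0 <= r < height and 0 <= c < width else 0.0
            for c in range(col0, col0 + size)
        ]
        for r in range(row0, row0 + size)
    ]


image = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
print(extract_glimpse(image, center=(0, 0), size=3))
```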

IPC Classes ?

G06V 10/44 - Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
G06F 18/2431 - Multiple classes
G06V 20/80 - Recognising image objects characterised by unique random patterns
G06V 30/194 - References adjustable by an adaptive method, e.g. learning
G06V 30/413 - Classification of content, e.g. text, photographs or tables

38. ATTENTION NEURAL NETWORKS WITH SHORT-TERM MEMORY UNITS

Application Number	18275052
Status	Pending
Filing Date	2022-02-07
First Publication Date	2024-03-21
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Banino, Andrea Badia, Adrià Puigdomènech Walker, Jacob Charles Scholtes, Timothy Anthony Julian Mitrovic, Jovana Blundell, Charles

Abstract

A system for controlling an agent interacting with an environment to perform a task. The system includes an action selection neural network configured to generate action selection outputs that are used to select actions to be performed by the agent. The action selection neural network includes an encoder sub network configured to generate encoded representations of the current observations; an attention sub network configured to generate attention sub network outputs with the use of an attention mechanism; a recurrent sub network configured to generate recurrent sub network outputs; and an action selection sub network configured to generate the action selection outputs that are used to select the actions to be performed by the agent in response to the current observations.

IPC Classes ?

G06N 3/0442 - Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
G06N 3/092 - Reinforcement learning

39. CONTROLLING INDUSTRIAL FACILITIES USING HIERARCHICAL REINFORCEMENT LEARNING

Application Number	EP2023075295
Publication Number	2024/056800
Status	In Force
Filing Date	2023-09-14
Publication Date	2024-03-21
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Wong, William Dutta, Praneet Luo, Jerry Jiayu

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling a facility through hierarchical reinforcement learning. In particular, the facility is controlled using a high-level controller neural network that makes high-level decisions and a low-level controller neural network that makes low-level controller decisions.

IPC Classes ?

G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric

40. DATA-EFFICIENT REINFORCEMENT LEARNING WITH ADAPTIVE RETURN COMPUTATION SCHEMES

Application Number	EP2023075512
Publication Number	2024/056891
Status	In Force
Filing Date	2023-09-15
Publication Date	2024-03-21
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Jiang, Ray Puigdomènech Badia, Adrià Campos Camúñez, Víctor Kapturowski, Steven James Rakicevic, Nemanja

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for data-efficient reinforcement learning with adaptive return computation schemes.

IPC Classes ?

G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G06N 3/045 - Combinations of networks
G06N 3/092 - Reinforcement learning
G06N 3/096 - Transfer learning
G06N 3/0464 - Convolutional networks [CNN, ConvNet]
G06N 3/084 - Backpropagation, e.g. using gradient descent
G06N 3/0985 - Hyperparameter optimisation; Meta-learning; Learning-to-learn

41. TRAINING POLICY NEURAL NETWORKS IN SIMULATION USING SCENE SYNTHESIS MACHINE LEARNING MODELS

Application Number	EP2023075514
Publication Number	2024/056892
Status	In Force
Filing Date	2023-09-15
Publication Date	2024-03-21
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Byravan, Arunkumar Humplik, Jan Hasenclever, Leonard Brussee, Arthur Karl Nori, Francesco

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network for use in controlling a robot. In particular, the policy neural network can be trained in simulation using images generated by a scene synthesis machine learning model.

IPC Classes ?

G06N 3/0895 - Weakly supervised learning, e.g. semi-supervised or self-supervised learning
G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G06N 3/092 - Reinforcement learning
G06N 3/096 - Transfer learning
G06N 3/045 - Combinations of networks

42. DETERMINING PRINCIPAL COMPONENTS USING MULTI-AGENT INTERACTION

Application Number	18275045
Status	Pending
Filing Date	2022-02-07
First Publication Date	2024-03-14
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Gemp, Ian Michael Mcwilliams, Brian

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining principal components of a data set using multi-agent interactions. One of the methods includes obtaining initial estimates for a plurality of principal components of a data set; and generating a final estimate for each principal component by repeatedly performing operations comprising: generating a reward estimate using the current estimate of the principal component, wherein the reward estimate is larger if the current estimate of the principal component captures more variance in the data set; generating, for each parent principal component of the principal component, a punishment estimate, wherein the punishment estimate is larger if the current estimate of the principal component and the current estimate of the parent principal component are not orthogonal; and updating the current estimate of the principal component according to a difference between the reward estimate and the punishment estimates.
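
The reward and punishment updates can be sketched as gradient ascent on a per-component utility. The specific forms used here are assumptions consistent with the abstract: the reward is the variance captured, `<v, C v>`, and the punishment for each parent `u` is `<v, C u>^2 / <u, C u>`, which is zero when the two estimates are orthogonal under the covariance matrix `C`. The step size, iteration count, and renormalization after each step are illustrative choices, and `C` is assumed to be positive definite.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def mat_vec(m: Matrix, v: Vector) -> Vector:
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def normalize(v: Vector) -> Vector:
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def estimate_principal_components(
    cov: Matrix, initial: List[Vector], steps: int = 500, lr: float = 0.1
) -> List[Vector]:
    estimates = [normalize(v) for v in initial]
    for _ in range(steps):
        for i, v in enumerate(estimates):
            # Gradient of the reward <v, C v>: favours high-variance directions.
            reward_grad = [2.0 * x for x in mat_vec(cov, v)]
            # Gradient of the punishments, summed over parents (earlier components).
            punish_grad = [0.0] * len(v)
            for parent in estimates[:i]:
                c_parent = mat_vec(cov, parent)
                coeff = 2.0 * dot(v, c_parent) / dot(parent, c_parent)
                punish_grad = [p + coeff * x for p, x in zip(punish_grad, c_parent)]
            stepped = [x + lr * (r - p) for x, r, p in zip(v, reward_grad, punish_grad)]
            estimates[i] = normalize(stepped)
    return estimates


covariance = [[3.0, 0.0], [0.0, 1.0]]
print(estimate_principal_components(covariance, [[1.0, 1.0], [1.0, -1.0]]))
```

For this diagonal covariance the estimates should approach the coordinate axes, up to sign, with the first component aligned with the higher-variance axis.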

IPC Classes ?

G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks

43. PREDICTING COMPLETE PROTEIN REPRESENTATIONS FROM MASKED PROTEIN REPRESENTATIONS

Application Number	18273594
Status	Pending
Filing Date	2022-01-27
First Publication Date	2024-03-14
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Pritzel, Alexander Ionescu, Catalin-Dumitru Kohl, Simon

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for unmasking a masked representation of a protein using a protein reconstruction neural network. In one aspect, a method comprises: receiving the masked representation of the protein; and processing the masked representation of the protein using the protein reconstruction neural network to generate a respective predicted embedding corresponding to one or more masked embeddings that are included in the masked representation of the protein, wherein a predicted embedding corresponding to a masked embedding in a representation of the amino acid sequence of the protein defines a prediction for an identity of an amino acid at a corresponding position in the amino acid sequence, wherein a predicted embedding corresponding to a masked embedding in a representation of the structure of the protein defines a prediction for a corresponding structural feature of the protein.

IPC Classes ?

G16B 40/20 - Supervised data analysis
G16B 15/20 - Protein or domain folding
G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
G16B 20/50 - Mutagenesis
G16B 30/00 - ICT specially adapted for sequence analysis involving nucleotides or amino acids

44. CONTROLLING AGENTS USING STATE ASSOCIATIVE LEARNING FOR LONG-TERM CREDIT ASSIGNMENT

Application Number	18275542
Status	Pending
Filing Date	2022-02-04
First Publication Date	2024-03-14
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Ritter, Samuel Raposo, David Nunes

Abstract

A computer-implemented reinforcement learning neural network system that learns a model of rewards in order to relate actions by an agent in an environment to their long-term consequences. The model learns to decompose the rewards into components explainable by different past states. That is, the model learns to associate when being in a particular state of the environment is predictive of a reward in a later state, even when the later state, and reward, is only achieved after a very long time delay.

IPC Classes ?

G06N 3/08 - Learning methods

45. ACTION ABSTRACTION CONTROLLER FOR FULLY ACTUATED ROBOTIC MANIPULATORS

Application Number	EP2023067028
Publication Number	2024/051978
Status	In Force
Filing Date	2023-06-22
Publication Date	2024-03-14
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Chen, Jose Enrique Laurens, Antoine Marin Alix Romano, Francesco Scholz, Jonathan Karl Fernandes Martins, Murilo Nori, Francesco

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for controlling a robot manipulator that has a plurality of joints. One of the methods includes obtaining a control input that comprises one or more velocity values that specify a target velocity of a reference point in a given coordinate frame; determining a respective joint velocity for each of the plurality of joints by generating a solution to an optimization problem formulated from the control input; and controlling the robot manipulator, including causing the plurality of joints of the robot manipulator to move in accordance with the respective joint velocities to approximate the control input.
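
As an illustration of turning a target velocity into joint velocities by solving an optimization problem, the sketch below uses damped least squares on a two-joint planar arm. The arm model, link lengths, damping value, and the particular objective (tracking error plus a small penalty on joint speed) are assumptions; the application may formulate the problem with different terms and constraints.

```python
import math
from typing import List, Tuple

Jacobian = List[List[float]]


def planar_jacobian(q1: float, q2: float, l1: float = 1.0, l2: float = 1.0) -> Jacobian:
    """Jacobian of the end-effector position of a two-link planar arm."""
    s1, c1 = math.sin(q1), math.cos(q1)
    s12, c12 = math.sin(q1 + q2), math.cos(q1 + q2)
    return [
        [-l1 * s1 - l2 * s12, -l2 * s12],
        [l1 * c1 + l2 * c12, l2 * c12],
    ]


def joint_velocities(
    jacobian: Jacobian, target_velocity: Tuple[float, float], damping: float = 1e-3
) -> Tuple[float, float]:
    """Minimise |J qdot - v|^2 + damping * |qdot|^2 in closed form for a 2x2 J."""
    (a, b), (c, d) = jacobian
    vx, vy = target_velocity
    # Normal equations: (J^T J + damping * I) qdot = J^T v
    m11 = a * a + c * c + damping
    m12 = a * b + c * d
    m22 = b * b + d * d + damping
    r1 = a * vx + c * vy
    r2 = b * vx + d * vy
    det = m11 * m22 - m12 * m12
    return ((m22 * r1 - m12 * r2) / det, (m11 * r2 - m12 * r1) / det)


J = planar_jacobian(0.0, math.pi / 2)
print(joint_velocities(J, (0.1, 0.0)))
```

The damping term keeps the solution bounded near singular configurations, at the cost of tracking the target velocity only approximately.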

IPC Classes ?

B25J 9/16 - Programme controls
G05B 19/427 - Teaching successive positions by tracking the position of a joystick or handle to control the positioning servo of the tool head, master-slave control

46. CONTROLLING AGENTS USING AMBIGUITY-SENSITIVE NEURAL NETWORKS AND RISK-SENSITIVE NEURAL NETWORKS

Application Number	EP2023074759
Publication Number	2024/052544
Status	In Force
Filing Date	2023-09-08
Publication Date	2024-03-14
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Grau Moya, Jordi Delétang, Grégoire Kunesch, Markus Ortega Caballero, Pedro Alejandro

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling agents. In particular, an agent can be controlled using an action selection system that is risk-sensitive, ambiguity-sensitive, or both.

IPC Classes ?

G06N 3/044 - Recurrent networks, e.g. Hopfield networks
G06N 3/045 - Combinations of networks
G06N 3/092 - Reinforcement learning
G06N 3/0985 - Hyperparameter optimisation; Meta-learning; Learning-to-learn
G06N 3/0464 - Convolutional networks [CNN, ConvNet]
G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks

47. SELECTION-INFERENCE NEURAL NETWORK SYSTEMS

Application Number	EP2023073796
Publication Number	2024/047108
Status	In Force
Filing Date	2023-08-30
Publication Date	2024-03-07
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Creswell, Antonia Phoebe Nina Shanahan, Murray

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a response to a query input using a selection-inference neural network.

IPC Classes ?

G06N 5/046 - Forward inferencing; Production systems
G06N 3/042 - Knowledge-based neural networks; Logical representations of neural networks
G06N 5/025 - Extracting rules from data
G06N 5/04 - Inference or reasoning models
G06N 5/045 - Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence

48. PREDICTING EXCHANGE-CORRELATION ENERGIES OF ATOMIC SYSTEMS USING NEURAL NETWORKS

Application Number	18260182
Status	Pending
Filing Date	2022-01-07
First Publication Date	2024-02-29
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Kirkpatrick, James Mcmorrow, Brendan Charles Turban, David Herbert Phlipp Gaunt, Alexander Lloyd Spencer, James Matthews, Alexander Graeme De Garis Cohen, Aron Jonathan

Abstract

Methods, computer systems, and apparatus, including computer programs encoded on computer storage media, for predicting an exchange-correlation energy of an atomic system. The system obtains respective electron-orbital features of the atomic system at each of a plurality of grid points; generates, for each of the plurality of grid points, a respective input feature vector for the electron-orbital features at the grid point; and processes the respective input feature vectors for the plurality of grid points using a neural network to generate a predicted exchange-correlation energy of the atomic system.

IPC Classes ?

G16C 20/30 - Prediction of properties of chemical compounds, compositions or mixtures
G06N 3/08 - Learning methods
G16C 20/70 - Machine learning, data mining or chemometrics

49. RENDERING NEW IMAGES OF SCENES USING GEOMETRY-AWARE NEURAL NETWORKS CONDITIONED ON LATENT VARIABLES

Application Number	18275332
Status	Pending
Filing Date	2022-02-04
First Publication Date	2024-02-29
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Kosiorek, Adam Roman Strathmann, Heiko Rezende, Danilo Jimenez Zoran, Daniel Moreno Comellas, Pol

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for rendering a new image that depicts a scene from a perspective of a camera at a new camera location. In one aspect, a method comprises: receiving a plurality of observations characterizing the scene; generating a latent variable representing the scene from the plurality of observations characterizing the scene; conditioning a scene representation neural network on the latent variable representing the scene, wherein the scene representation neural network conditioned on the latent variable representing the scene defines a geometric model of the scene as a three-dimensional (3D) radiance field; and rendering the new image that depicts the scene from the perspective of the camera at the new camera location using the scene representation neural network conditioned on the latent variable representing the scene.

IPC Classes ?

G06T 15/20 - Perspective computation
G06N 3/045 - Combinations of networks
G06N 3/084 - Backpropagation, e.g. using gradient descent
G06T 15/06 - Ray-tracing
G06T 15/50 - Lighting effects

50. DATA-EFFICIENT REINFORCEMENT LEARNING FOR CONTINUOUS CONTROL TASKS

Application Number	18351440
Status	Pending
Filing Date	2023-07-12
First Publication Date	2024-02-22
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Riedmiller, Martin Hafner, Roland Vecerik, Mel Lillicrap, Timothy Paul Lampe, Thomas Popov, Ivaylo Barth-Maron, Gabriel Heess, Nicolas Manfred Otto

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-efficient reinforcement learning. One of the systems is a system for training an actor neural network used to select actions to be performed by an agent that interacts with an environment by receiving observations characterizing states of the environment and, in response to each observation, performing an action selected from a continuous space of possible actions, wherein the actor neural network maps observations to next actions in accordance with values of parameters of the actor neural network, and wherein the system comprises: a plurality of workers, wherein each worker is configured to operate independently of each other worker, wherein each worker is associated with a respective agent replica that interacts with a respective replica of the environment during the training of the actor neural network.

IPC Classes ?

G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G06N 3/08 - Learning methods
G06N 3/088 - Non-supervised learning, e.g. competitive learning
G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
G06N 3/045 - Combinations of networks

51. DETERMINING FAILURE CASES IN TRAINED NEURAL NETWORKS USING GENERATIVE NEURAL NETWORKS

Application Number	EP2023072617
Publication Number	2024/038114
Status	In Force
Filing Date	2023-08-16
Publication Date	2024-02-22
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Gowal, Sven Adrian Wiles, Olivia Anne Carneiro De Albuquerque, Isabela Maria

Abstract

Methods, systems, and computer readable storage media for performing operations comprising: obtaining a plurality of initial network inputs that have been classified as belonging to a corresponding ground truth class; processing each of the plurality of initial network inputs using a trained target neural network to generate a respective predicted network output for each initial network input, the respective predicted network output comprising a respective score for each of a plurality of classes, the plurality of classes comprising the ground truth class; identifying, based on the respective predicted network outputs and the ground truth class, a subset of the initial network inputs as having been misclassified by the trained target neural network; and determining, based on the subset of initial network inputs, one or more failure case latent representations, wherein each failure case latent representation is a latent representation that characterizes network inputs that belong to the ground truth class but that are likely to be misclassified by the trained target neural network.
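
The first two steps, scoring inputs and keeping those the network misclassifies, are straightforward to sketch. The final step below is only a placeholder: it averages the latents of the misclassified inputs, which is an assumption and not the method described in the application. `toy_scores` and `toy_encode` are hypothetical stand-ins for the trained target network and a latent encoder.

```python
from typing import Callable, List, Sequence


def misclassified(
    inputs: Sequence[float],
    score_fn: Callable[[float], List[float]],
    truth_class: int,
) -> List[float]:
    """Keep the inputs whose highest-scoring class differs from the ground truth."""
    subset = []
    for x in inputs:
        scores = score_fn(x)
        predicted = max(range(len(scores)), key=scores.__getitem__)
        if predicted != truth_class:
            subset.append(x)
    return subset


def failure_latent(
    subset: Sequence[float], encode: Callable[[float], List[float]]
) -> List[float]:
    """Placeholder summary: mean latent of the misclassified inputs."""
    latents = [encode(x) for x in subset]
    dim = len(latents[0])
    return [sum(z[k] for z in latents) / len(latents) for k in range(dim)]


def toy_scores(x: float) -> List[float]:
    return [1.0 - x, x]  # two-class scores for a scalar input


def toy_encode(x: float) -> List[float]:
    return [x, x * x]


failures = misclassified([0.9, 0.2, 0.4, 0.7], toy_scores, truth_class=1)
print(failures, failure_latent(failures, toy_encode))
```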

IPC Classes

G06N 5/045 - Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
G06N 3/045 - Combinations of networks
G06N 3/0475 - Generative networks
G06N 3/09 - Supervised learning

52. SOLVING MIXED INTEGER PROGRAMS USING NEURAL NETWORKS

Application Number	18267363
Status	Pending
Filing Date	2021-12-20
First Publication Date	2024-02-22
Owner	DeepMind Technologies Limited (USA)
Inventor	Bartunov, Sergey Gimeno Gil, Felix Axel Von Glehn, Ingrid Karin Lichocki, Pawel Lobov, Ivan Nair, Vinod O'Donoghue, Brendan Timothy Sonnerat, Nicolas Tjandraatmadja, Christian Wang, Pengming

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for solving mixed integer programs (MIPs) using neural networks. One of the methods includes obtaining data specifying parameters of a MIP; generating, from the parameters of the MIP, an input representation; processing the input representation using an encoder neural network to generate a respective embedding for each of the integer variables; generating a plurality of partial assignments by selecting a respective second, proper subset of the integer variables; and for each of the variables in the respective second subset, generating, using at least the respective embedding for the variable, a respective additional constraint on the value of the variable; generating, for each of the partial assignments, a corresponding candidate final assignment that assigns a respective value to each of the plurality of variables; and selecting, as a final assignment for the MIP, one of the candidate final assignments.

IPC Classes

G06N 3/08 - Learning methods
G06F 17/11 - Complex mathematical operations for solving equations

53. Selecting actions from large discrete action sets using reinforcement learning

Application Number	17131500
Grant Number	11907837
Status	In Force
Filing Date	2020-12-22
First Publication Date	2024-02-20
Grant Date	2024-02-20
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Dulac-Arnold, Gabriel Evans, Richard Andrew Coppin, Benjamin Kenneth

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting actions from large discrete action sets. One of the methods includes receiving a particular observation representing a particular state of an environment; and selecting an action from a discrete set of actions to be performed by an agent interacting with the environment, comprising: processing the particular observation using an actor policy network to generate an ideal point; determining, from the points that represent actions in the set, the k nearest points to the ideal point; for each nearest point of the k nearest points: processing the nearest point and the particular observation using a Q network to generate a respective Q value for the action represented by the nearest point; and selecting the action to be performed by the agent from the k actions represented by the k nearest points based on the Q values.
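
The selection procedure in the abstract above can be sketched in a few lines of Python. The actor and Q function here are plain callables standing in for the actor policy network and Q network, Euclidean distance is assumed as the nearness measure, and all names are hypothetical.

```python
import math
from typing import Callable, Sequence, Tuple

Point = Tuple[float, ...]


def select_action(
    observation: Point,
    action_points: Sequence[Point],
    actor: Callable[[Point], Point],
    q_function: Callable[[Point, Point], float],
    k: int,
) -> int:
    """Return the index of the chosen action in `action_points`."""
    # 1. The actor maps the observation to an "ideal point" in action space.
    ideal = actor(observation)
    # 2. Find the k action points nearest to the ideal point (Euclidean distance assumed).
    by_distance = sorted(
        range(len(action_points)),
        key=lambda i: math.dist(action_points[i], ideal),
    )
    nearest = by_distance[:k]
    # 3. Score each candidate with the Q function and pick the highest Q value.
    return max(nearest, key=lambda i: q_function(observation, action_points[i]))


# Toy setup: four one-dimensional actions; the Q function simply prefers larger points.
actions = [(0.0,), (1.0,), (2.0,), (3.0,)]
toy_actor = lambda obs: (obs[0],)
toy_q = lambda obs, point: point[0]
chosen = select_action((1.2,), actions, toy_actor, toy_q, k=2)
```

Only the k candidates are evaluated by the Q function, which is what keeps the cost manageable when the discrete action set is large.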

IPC Classes

G06N 3/08 - Learning methods

54. AUTOMATED DISCOVERY OF AGENTS IN SYSTEMS

Application Number	EP2023071987
Publication Number	2024/033387
Status	In Force
Filing Date	2023-08-08
Publication Date	2024-02-15
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Jebreel, Zachary Alex Kenton Kumar, Ramana Richens, Jonathan George Everitt, Tom Åke Helmer Farquhar, Aiken Sebastian Macdermott, Matthew Joseph Tilley

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for identifying agents in a system. According to one aspect, a method comprises: generating data defining a causal model of the system, comprising transmitting instructions to cause a plurality of interventions to be applied to the system, wherein each intervention modifies one or more variable elements in the system; processing the model of the system to identify one or more of the variable elements in the system as being decision elements, wherein each decision element represents an action selected by a respective agent in the system; identifying one or more agents in the system based on the decision elements; and outputting data that identifies the agents in the system.

IPC Classes

G06N 3/092 - Reinforcement learning
G06N 5/045 - Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
G06N 3/042 - Knowledge-based neural networks; Logical representations of neural networks

55. GRAPH NEURAL NETWORK SYSTEMS FOR GENERATING STRUCTURED REPRESENTATIONS OF OBJECTS

Application Number	18144810
Status	Pending
Filing Date	2023-05-08
First Publication Date	2024-02-15
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Li, Yujia Dyer, Christopher James Vinyals, Oriol

Abstract

There is described a neural network system for generating a graph, the graph comprising a set of nodes and edges. The system comprises one or more neural networks configured to represent a probability distribution over sequences of node generating decisions and/or edge generating decisions, and one or more computers configured to sample the probability distribution represented by the one or more neural networks to generate a graph.
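
To make the idea of sampling a graph from a distribution over node and edge decisions concrete, here is a minimal Python sketch. The two probability callables stand in for the neural networks; the specific decision order (add a node, then repeatedly decide whether to connect it to an earlier node) and every name are assumptions, not details taken from the abstract.

```python
import random
from typing import Callable, List, Tuple

Edge = Tuple[int, int]
Prob = Callable[[List[int], List[Edge]], float]


def sample_graph(
    p_add_node: Prob,
    p_add_edge: Prob,
    rng: random.Random,
    max_nodes: int = 10,
) -> Tuple[List[int], List[Edge]]:
    """Sample a graph by drawing a sequence of node and edge generating decisions."""
    nodes: List[int] = []
    edges: List[Edge] = []
    # Decision 1: add another node? (probability may depend on the graph so far)
    while len(nodes) < max_nodes and rng.random() < p_add_node(nodes, edges):
        new_node = len(nodes)
        nodes.append(new_node)
        candidates = list(range(new_node))  # earlier nodes not yet linked to new_node
        # Decision 2: add another edge from the new node? Decision 3: to which node?
        while candidates and rng.random() < p_add_edge(nodes, edges):
            target = candidates.pop(rng.randrange(len(candidates)))
            edges.append((new_node, target))
    return nodes, edges


# Constant probabilities replace the learned networks in this toy example.
nodes, edges = sample_graph(lambda n, e: 0.8, lambda n, e: 0.5, random.Random(0))
```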

IPC Classes

G06N 3/047 - Probabilistic or stochastic networks
G06F 16/901 - Indexing; Data structures therefor; Storage structures
G06F 17/18 - Complex mathematical operations for evaluating statistical data
G06N 3/08 - Learning methods
G06N 3/045 - Combinations of networks

56. FINDING A STATIONARY POINT OF A LOSS FUNCTION BY AN ITERATIVE ALGORITHM USING A VARIABLE LEARNING RATE VALUE

Application Number	EP2023072108
Publication Number	2024/033445
Status	In Force
Filing Date	2023-08-09
Publication Date	2024-02-15
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Rosca, Mihaela Dherin, Benoit Richard Umbert Wu, Yan Qin, Chongli

Abstract

A computer-implemented method for determining, for a loss function which is a function of a parameter vector comprising a plurality of parameters, values for the parameters for which the parameter vector is a stationary point of the loss function. The method comprises determining initial values for the parameters; and repeatedly updating the parameters by: (a) determining at least one drift value indicative of discretization drift for a discrete update to the parameters based on the loss function; (b) determining at least one learning rate value by evaluating a learning rate function based on, and having an inverse relationship with, the at least one drift value; (c) determining respective updates to the parameters based upon a product of the at least one learning rate value and a gradient of the loss function with respect to the respective parameter for current values of the parameters; and (d) updating the parameters based upon the determined respective updates.
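
Steps (a) to (d) above can be sketched as a small gradient descent loop in Python. The abstract does not define the drift value or the learning rate function, so both are assumptions here: the drift is taken to be the norm of a finite-difference Hessian-vector product divided by the gradient norm, and the learning rate is `base_lr / (1 + drift)`, which decreases as the drift grows. The toy quadratic loss is also hypothetical.

```python
import math
from typing import Callable, List

Vector = List[float]


def quadratic_grad(theta: Vector) -> Vector:
    """Gradient of the toy loss L(theta) = 0.5 * (theta[0]**2 + 10 * theta[1]**2)."""
    return [1.0 * theta[0], 10.0 * theta[1]]


def drift_value(theta: Vector, g: Vector, grad_fn: Callable[[Vector], Vector], eps: float = 1e-4) -> float:
    """(a) Assumed drift measure: ||H g|| / ||g||, with H g estimated by finite differences."""
    g_norm = math.sqrt(sum(gi * gi for gi in g))
    if g_norm == 0.0:
        return 0.0
    shifted = [t + eps * gi for t, gi in zip(theta, g)]
    hvp = [(gs - gi) / eps for gs, gi in zip(grad_fn(shifted), g)]
    return math.sqrt(sum(h * h for h in hvp)) / g_norm


def learning_rate(drift: float, base_lr: float = 0.3) -> float:
    """(b) Assumed learning rate function, inversely related to the drift value."""
    return base_lr / (1.0 + drift)


def find_stationary_point(theta: Vector, grad_fn: Callable[[Vector], Vector], steps: int = 500) -> Vector:
    for _ in range(steps):
        g = grad_fn(theta)
        lr = learning_rate(drift_value(theta, g, grad_fn))
        # (c) update = learning rate * gradient; (d) apply the update.
        theta = [t - lr * gi for t, gi in zip(theta, g)]
    return theta


theta_star = find_stationary_point([1.0, 1.0], quadratic_grad)
```

On this quadratic the drift stays between roughly 1 and 10, so the step size shrinks automatically while the high-curvature coordinate dominates the gradient and grows again once it has decayed.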

IPC Classes

G06N 3/09 - Supervised learning
G06F 17/11 - Complex mathematical operations for solving equations
G06N 3/0464 - Convolutional networks [CNN, ConvNet]
G06N 3/045 - Combinations of networks
G06N 3/092 - Reinforcement learning
G06N 3/094 - Adversarial learning

57. CONTROLLING AGENTS USING AUXILIARY PREDICTION NEURAL NETWORKS THAT GENERATE STATE VALUE ESTIMATES

Application Number	18230056
Status	Pending
Filing Date	2023-08-03
First Publication Date	2024-02-08
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Zaheer, Muhammad Modayil, Joseph Varughese

Abstract

Method, system, and non-transitory computer storage media for selecting actions to be performed by an agent to interact with an environment to perform a main task by, for each time step in a sequence of time steps: receiving a set of features representing an observation; for each of one or more auxiliary prediction neural networks, generating a state value estimate for the current state of the environment relative to a corresponding auxiliary reward that measures values of a corresponding target feature from the set of features representing the observations for the sequence of time steps; processing an input comprising a respective intermediate output generated by each auxiliary neural network at the time step using an action selection neural network to generate an action selection output; and selecting the action to be performed by the agent at the time step using the action selection output.

IPC Classes

G06N 3/045 - Combinations of networks
G06N 3/092 - Reinforcement learning

58. JOINTLY UPDATING AGENT CONTROL POLICIES USING ESTIMATED BEST RESPONSES TO CURRENT CONTROL POLICIES

Application Number	18275881
Status	Pending
Filing Date	2022-02-07
First Publication Date	2024-02-08
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Marris, Luke Christopher Muller, Paul Fernand Michel Lanctot, Marc Graepel, Thore Kurt Hartwig

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating control policies for controlling agents in an environment. One of the methods includes, at each of a plurality of iterations: obtaining a current joint control policy for a plurality of agents, the current joint control policy specifying a respective current control policy for each agent; and updating the current joint control policy, comprising, for each agent: generating a respective reward estimate for each of a plurality of alternate control policies that is an estimate of a reward received by the agent if the agent is controlled using the alternate control policy while the other agents are controlled using the respective current control policies; computing a best response for the agent from the respective reward estimates; and updating the respective current control policy for the agent using the best response for the agent.
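
One iteration of the joint update described above might look like the following Python sketch. It simplifies heavily: each control policy is a single pure action, the reward estimate is an exact lookup in a hypothetical prisoner's dilemma payoff table, and the current policy is replaced outright by the best response, whereas the abstract only says the policy is updated "using" the best response.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical payoff table: (action of agent 0, action of agent 1) -> (reward 0, reward 1).
PAYOFF: Dict[Tuple[str, str], Tuple[int, int]] = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def reward_estimate(agent: int, alternative: str, joint_policy: Sequence[str]) -> float:
    """Reward for `agent` if it switches to `alternative` while the others keep their policies."""
    changed = list(joint_policy)
    changed[agent] = alternative
    return PAYOFF[tuple(changed)][agent]


def joint_best_response_step(
    joint_policy: Sequence[str],
    alternatives: Sequence[Sequence[str]],
    estimate: Callable[[int, str, Sequence[str]], float],
) -> List[str]:
    """Compute every agent's best response against the current joint policy, then update jointly."""
    updated = []
    for agent in range(len(joint_policy)):
        estimates = {alt: estimate(agent, alt, joint_policy) for alt in alternatives[agent]}
        updated.append(max(estimates, key=estimates.get))
    return updated


alternatives = [["C", "D"], ["C", "D"]]
after_one = joint_best_response_step(["C", "C"], alternatives, reward_estimate)
after_two = joint_best_response_step(after_one, alternatives, reward_estimate)
```

All best responses are computed against the same current joint policy before any agent is updated, which is what makes the update joint rather than sequential.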

IPC Classes

G06N 3/092 - Reinforcement learning

59. AUGMENTING ATTENTION-BASED NEURAL NETWORKS TO SELECTIVELY ATTEND TO PAST INPUTS

Application Number	18486060
Status	Pending
Filing Date	2023-10-12
First Publication Date	2024-02-08
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Rae, Jack William Potapenko, Anna Lillicrap, Timothy Paul

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for performing a machine learning task on a network input that is a sequence to generate a network output. In one aspect, one of the methods includes, for each particular sequence of layer inputs: for each attention layer in the neural network: maintaining episodic memory data; maintaining compressed memory data; receiving a layer input to be processed by the attention layer; and applying an attention mechanism over (i) the compressed representation in the compressed memory data for the layer, (ii) the hidden states in the episodic memory data for the layer, and (iii) the respective hidden state at each of the plurality of input positions in the particular network input to generate a respective activation for each input position in the layer input.

IPC Classes

G06N 3/084 - Backpropagation, e.g. using gradient descent
G06N 3/08 - Learning methods
G06F 18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
G06N 3/047 - Probabilistic or stochastic networks

60. MULTI-TASK NEURAL NETWORKS WITH TASK-SPECIFIC PATHS

Application Number	18487707
Status	Pending
Filing Date	2023-10-16
First Publication Date	2024-02-08
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Wierstra, Daniel Pieter Fernando, Chrisantha Thomas Pritzel, Alexander Banarse, Dylan Sunil Blundell, Charles Rusu, Andrei-Alexandru Zwols, Yori Ha, David

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for using multi-task neural networks. One of the methods includes receiving a first network input and data identifying a first machine learning task to be performed on the first network input; selecting a path through the plurality of layers in a super neural network that is specific to the first machine learning task, the path specifying, for each of the layers, a proper subset of the modular neural networks in the layer that are designated as active when performing the first machine learning task; and causing the super neural network to process the first network input using (i) for each layer, the modular neural networks in the layer that are designated as active by the selected path and (ii) the set of one or more output layers corresponding to the identified first machine learning task.
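
The path mechanism above can be illustrated with a toy "super network" in Python. Modules are plain functions, the outputs of the active modules in a layer are summed (an assumption; the abstract does not say how they are combined), and the layer contents, paths, and output heads are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence

Module = Callable[[float], float]


def run_super_network(
    x: float,
    layers: Sequence[Sequence[Module]],
    paths: Dict[str, Sequence[Sequence[int]]],
    output_heads: Dict[str, Module],
    task: str,
) -> float:
    """Process `x` using only the modules that the task's path marks as active."""
    h = x
    for modules, active in zip(layers, paths[task]):
        # A path must select a proper subset of each layer's modules.
        assert 0 < len(active) < len(modules)
        h = sum(modules[i](h) for i in active)
    # Each task has its own output head.
    return output_heads[task](h)


layers: List[List[Module]] = [
    [lambda h: h + 1, lambda h: h * 2, lambda h: h - 3],
    [lambda h: h * h, lambda h: -h, lambda h: h / 2],
]
paths = {"a": [(0, 1), (0,)], "b": [(2,), (1, 2)]}
heads = {"a": lambda h: h, "b": lambda h: h * 10}

out_a = run_super_network(2, layers, paths, heads, "a")
out_b = run_super_network(2, layers, paths, heads, "b")
```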

IPC Classes

G06N 3/086 - Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
G06N 3/04 - Architecture, e.g. interconnection topology
G06N 3/044 - Recurrent networks, e.g. Hopfield networks
G06N 3/045 - Combinations of networks

61. DATA-DRIVEN ROBOT CONTROL

Application Number	18331632
Status	Pending
Filing Date	2023-06-08
First Publication Date	2024-02-08
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Cabi, Serkan Wang, Ziyu Novikov, Alexander Konyushkova, Ksenia Gomez Colmenarejo, Sergio Reed, Scott Ellison Denil, Misha Man Ray Scholz, Jonathan Karl Sushkov, Oleg O. Jeong, Rae Chan Barker, David Budden, David Vecerik, Mel Aytar, Yusuf Gomes De Freitas, Joao Ferdinando

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for data-driven robotic control. One of the methods includes maintaining robot experience data; obtaining annotation data; training, on the annotation data, a reward model; generating task-specific training data for the particular task, comprising, for each experience in a second subset of the experiences in the robot experience data: processing the observation in the experience using the trained reward model to generate a reward prediction, and associating the reward prediction with the experience; and training a policy neural network on the task-specific training data for the particular task, wherein the policy neural network is configured to receive a network input comprising an observation and to generate a policy output that defines a control policy for a robot performing the particular task.

IPC Classes

B25J 9/16 - Programme controls

62. Reinforcement learning with scheduled auxiliary control

Application Number	16289531
Grant Number	11893480
Status	In Force
Filing Date	2019-02-28
First Publication Date	2024-02-06
Grant Date	2024-02-06
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Riedmiller, Martin Hafner, Roland

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for reinforcement learning with scheduled auxiliary tasks. In one aspect, a method includes maintaining data specifying parameter values for a primary policy neural network and one or more auxiliary neural networks; at each of a plurality of selection time steps during a training episode comprising a plurality of time steps: receiving an observation, selecting a current task for the selection time step using a task scheduling policy, processing an input comprising the observation using the policy neural network corresponding to the selected current task to select an action to be performed by the agent in response to the observation, and causing the agent to perform the selected action.
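
A minimal Python sketch of the episode loop described above follows. It assumes that selection time steps occur every `period` steps and uses a round-robin scheduler in place of a learned task scheduling policy; the task names and the toy policies are hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Policy = Callable[[int], int]


def run_scheduled_episode(
    observations: Sequence[int],
    policies: Dict[str, Policy],
    scheduler: Callable[[int], str],
    period: int,
) -> List[Tuple[str, int]]:
    """Return the (task, action) pair chosen at each time step of one episode."""
    trajectory = []
    current_task = None
    for t, observation in enumerate(observations):
        # At a selection time step, ask the scheduler which task to pursue next.
        if t % period == 0:
            current_task = scheduler(t // period)
        # The policy belonging to the current task selects the action.
        action = policies[current_task](observation)
        trajectory.append((current_task, action))
    return trajectory


TASKS = ["main", "reach", "grasp"]
policies = {"main": lambda o: o + 1, "reach": lambda o: -o, "grasp": lambda o: 0}
round_robin = lambda i: TASKS[i % len(TASKS)]

trajectory = run_scheduled_episode([1, 2, 3, 4, 5], policies, round_robin, period=2)
```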

IPC Classes

G06N 3/08 - Learning methods
G06N 3/04 - Architecture, e.g. interconnection topology
G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks

63. JOINTLY LEARNING EXPLORATORY AND NON-EXPLORATORY ACTION SELECTION POLICIES

Application Number	18334112
Status	Pending
Filing Date	2023-06-13
First Publication Date	2024-01-25
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Badia, Adrià Puigdomènech Sprechmann, Pablo Vitvitskyi, Alex Guo, Zhaohan Piot, Bilal Kapturowski, Steven James Tieleman, Olivier Blundell, Charles

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by an agent interacting with an environment. In one aspect, the method comprises: receiving an observation characterizing a current state of the environment; processing the observation and an exploration importance factor using the action selection neural network to generate an action selection output; selecting an action to be performed by the agent using the action selection output; determining an exploration reward; determining an overall reward based on: (i) the exploration importance factor, and (ii) the exploration reward; and training the action selection neural network using a reinforcement learning technique based on the overall reward.
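
The reward combination in the abstract above could, under one simple assumption, be a weighted sum. The sketch below adds a task reward (not mentioned in the abstract, included here as an assumption) to the exploration reward scaled by the exploration importance factor; the function name is hypothetical.

```python
def overall_reward(task_reward: float, exploration_reward: float, importance: float) -> float:
    """Assumed form: task reward plus exploration reward scaled by the importance factor."""
    return task_reward + importance * exploration_reward


# An importance factor of 0 gives a purely non-exploratory objective;
# larger factors weight the exploration reward more heavily.
greedy = overall_reward(task_reward=1.0, exploration_reward=0.5, importance=0.0)
curious = overall_reward(task_reward=1.0, exploration_reward=0.5, importance=2.0)
```

Because the same importance factor is also an input to the action selection network, a single network can represent both exploratory and non-exploratory behaviour.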

IPC Classes

G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G06N 3/04 - Architecture, e.g. interconnection topology
G06N 3/084 - Backpropagation, e.g. using gradient descent
G06F 18/22 - Matching criteria, e.g. proximity measures
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

64. ACTION CLASSIFICATION IN VIDEO CLIPS USING ATTENTION-BASED NEURAL NETWORKS

Application Number	18375941
Status	Pending
Filing Date	2023-10-02
First Publication Date	2024-01-25
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Carreira, Joao Doersch, Carl Zisserman, Andrew

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for classifying actions in a video. One of the methods includes obtaining a feature representation of a video clip; obtaining data specifying a plurality of candidate agent bounding boxes in the key video frame; and for each candidate agent bounding box: processing the feature representation through an action transformer neural network.

IPC Classes

G06V 20/40 - Scenes; Scene-specific elements in video content
G06V 10/25 - Determination of region of interest [ROI] or a volume of interest [VOI]
G06N 3/045 - Combinations of networks

65. OPTIMIZING ALGORITHMS FOR TARGET PROCESSORS USING REPRESENTATION NEURAL NETWORKS

Application Number	EP2023070308
Publication Number	2024/018065
Status	In Force
Filing Date	2023-07-21
Publication Date	2024-01-25
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Michi, Andrea Mankowitz, Daniel J. Zhernov, Anton Gelmi, Marco Oreste Selvi, Marco Paduraru, Cosmin Leurent, Edouard Mandhane, Amol Balkishan Iqbal, Shariq Nadeem Silver, David Riedmiller, Martin Kohli, Pushmeet Vinyals, Oriol

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for optimizing a target algorithm using a state representation neural network.

IPC Classes

G06F 8/41 - Compilation
G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks
G06N 3/02 - Neural networks

66. NEURAL NETWORKS IMPLEMENTING ATTENTION OVER OBJECT EMBEDDINGS FOR OBJECT-CENTRIC VISUAL REASONING

Application Number	18029980
Status	Pending
Filing Date	2021-10-01
First Publication Date	2024-01-18
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Ding, Fengning Santoro, Adam Anthony Hill, Felix George Botvinick, Matthew Piloto, Luis

Abstract

A video processing system configured to analyze a sequence of video frames to detect objects in the video frames and provide information relating to the detected objects in response to a query. The query may comprise, for example, a request for a prediction of a future event, or of the location of an object, or a request for a prediction of what would happen if an object were modified. The system uses a transformer neural network subsystem to process representations of objects in the video.

IPC Classes

G06V 20/40 - Scenes; Scene-specific elements in video content
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/776 - Validation; Performance evaluation

67. Selecting reinforcement learning actions using a low-level controller

Application Number	17541186
Grant Number	11875258
Status	In Force
Filing Date	2021-12-02
First Publication Date	2024-01-16
Grant Date	2024-01-16
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Heess, Nicolas Manfred Otto Lillicrap, Timothy Paul Wayne, Gregory Duncan Tassa, Yuval

Abstract

Methods, systems, and apparatus for selecting actions to be performed by an agent interacting with an environment. One system includes a high-level controller neural network, a low-level controller neural network, and a subsystem. The high-level controller neural network receives an input observation and processes the input observation to generate a high-level output defining a control signal for the low-level controller. The low-level controller neural network receives a designated component of an input observation and processes the designated component and an input control signal to generate a low-level output that defines an action to be performed by the agent in response to the input observation. The subsystem receives a current observation characterizing a current state of the environment, determines whether criteria are satisfied for generating a new control signal, and based on the determination, provides appropriate inputs to the high-level and low-level controllers for selecting an action to be performed by the agent.
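
The interplay between the two controllers and the subsystem can be sketched as a short Python loop. The refresh criterion used here (a new control signal every `refresh_every` steps) is an assumption, as are the observation layout with a `"proprio"` component and the toy controller functions.

```python
from typing import Callable, Dict, List, Optional, Sequence

Observation = Dict[str, float]


def run_hierarchy(
    observations: Sequence[Observation],
    high_level: Callable[[Observation], float],
    low_level: Callable[[float, float], float],
    refresh_every: int,
) -> List[float]:
    """Subsystem loop: refresh the control signal when due, then let the low level act."""
    control_signal: Optional[float] = None
    actions = []
    for t, observation in enumerate(observations):
        # Assumed criterion: generate a new control signal at fixed intervals.
        if control_signal is None or t % refresh_every == 0:
            control_signal = high_level(observation)
        # The low-level controller sees only a designated component of the observation.
        actions.append(low_level(observation["proprio"], control_signal))
    return actions


toy_high = lambda obs: obs["task"] * 2
toy_low = lambda proprio, signal: proprio + signal
episode = [
    {"proprio": 1, "task": 10},
    {"proprio": 2, "task": 20},
    {"proprio": 3, "task": 30},
]
actions = run_hierarchy(episode, toy_high, toy_low, refresh_every=2)
```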

IPC Classes

G06N 3/08 - Learning methods
G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G06N 3/044 - Recurrent networks, e.g. Hopfield networks
G06N 3/045 - Combinations of networks

68. VOCABULARY SELECTION FOR TEXT PROCESSING TASKS USING POWER INDICES

Application Number	18038631
Status	Pending
Filing Date	2021-11-22
First Publication Date	2024-01-11
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Gemp, Ian Michael Bachrach, Yoram Patel, Roma Dyer, Christopher James

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for selecting an input vocabulary for a machine learning model using power indices. One of the methods includes computing a respective score for each of a plurality of text tokens in an initial vocabulary and then selecting the text tokens in the input vocabulary based on the respective scores.
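
As an illustration of scoring tokens with a power index, the Python sketch below computes exact Shapley values over a tiny initial vocabulary and keeps the top-scoring tokens. The abstract does not say which power index is used or how a vocabulary's value is measured; the Shapley value and the corpus-coverage value function are assumptions, and exact enumeration of all orderings is only feasible for very small vocabularies.

```python
from itertools import permutations
from math import factorial
from typing import Callable, Dict, FrozenSet, List, Sequence


def shapley_scores(tokens: Sequence[str], value: Callable[[FrozenSet[str]], float]) -> Dict[str, float]:
    """Exact Shapley value of each token: its average marginal contribution over all orderings."""
    totals = {t: 0.0 for t in tokens}
    for order in permutations(tokens):
        coalition: FrozenSet[str] = frozenset()
        for token in order:
            extended = coalition | {token}
            totals[token] += value(extended) - value(coalition)
            coalition = extended
    n_orders = factorial(len(tokens))
    return {t: total / n_orders for t, total in totals.items()}


def select_vocabulary(tokens: Sequence[str], value: Callable[[FrozenSet[str]], float], size: int) -> List[str]:
    """Keep the `size` tokens with the highest scores."""
    scores = shapley_scores(tokens, value)
    return sorted(tokens, key=lambda t: scores[t], reverse=True)[:size]


# Hypothetical value function: how many words of a toy corpus the vocabulary covers.
CORPUS = ["the", "cat", "the", "dog", "the", "cat"]
coverage = lambda vocab: float(sum(1 for word in CORPUS if word in vocab))

scores = shapley_scores(["dog", "cat", "the"], coverage)
vocabulary = select_vocabulary(["dog", "cat", "the"], coverage, size=2)
```

For realistic vocabulary sizes the average over orderings would have to be approximated, for example by sampling a subset of permutations.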

IPC Classes

G10L 13/047 - Architecture of speech synthesisers
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination

69. MODEL-FREE REINFORCEMENT LEARNING WITH REGULARIZED NASH DYNAMICS

Application Number	EP2023067491
Publication Number	2024/003058
Status	In Force
Filing Date	2023-06-27
Publication Date	2024-01-04
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Perolat, Julien De Vylder, Bart Tuyls, Karl Paul

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a policy neural network that is used to control an agent. In particular, the policy neural network can be trained through model-free reinforcement learning with regularized Nash dynamics.

IPC Classes

G06N 3/092 - Reinforcement learning
G06N 3/045 - Combinations of networks

70. RECURRENT NEURAL NETWORKS FOR DATA ITEM GENERATION

Application Number	18367305
Status	Pending
Filing Date	2023-09-12
First Publication Date	2023-12-28
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Gregor, Karol Danihelka, Ivo

Abstract

Methods and systems, including computer programs encoded on computer storage media, for generating data items. A method includes reading a glimpse from a data item using a decoder hidden state vector of a decoder for a preceding time step, providing, as input to an encoder, the glimpse and decoder hidden state vector for the preceding time step for processing, receiving, as output from the encoder, a generated encoder hidden state vector for the time step, generating a decoder input from the generated encoder hidden state vector, providing the decoder input to the decoder for processing, receiving, as output from the decoder, a generated decoder hidden state vector for the time step, generating a neural network output update from the decoder hidden state vector for the time step, and combining the neural network output update with a current neural network output to generate an updated neural network output.

IPC Classes

G06N 3/04 - Architecture, e.g. interconnection topology
G06N 3/044 - Recurrent networks, e.g. Hopfield networks
G06N 3/045 - Combinations of networks

71. SIMULATING INDUSTRIAL FACILITIES FOR CONTROL

Application Number	EP2023067148
Publication Number	2023/247767
Status	In Force
Filing Date	2023-06-23
Publication Date	2023-12-28
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Dutta, Praneet Chervonyi, Iurii Voicu, Octavian Luo, Jerry Jiayu Trochim, Piotr

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for simulating industrial facilities for control. One of the methods includes, at each of a plurality of time steps during a task episode: receiving, from a computer simulator of an industrial facility, measurements representing a current state of the facility; generating, from the measurements, an observation; providing the observation as input to a control policy for controlling the facility; receiving, as output, an action for controlling one or more setpoints of the facility; generating, from the action, one or more control inputs for the one or more setpoints of the facility; and providing, as input to the simulator, (i) the control inputs and (ii) current values for one or more configuration parameters of the simulator to cause the simulator to generate, as output, new measurements representing a new state of the facility.
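
The per-time-step loop above maps naturally onto a small Python sketch. `ToySimulator`, its single temperature measurement, the `gain` configuration parameter, and the clamping of the action into a setpoint range are all hypothetical stand-ins for a real facility simulator and control policy.

```python
from typing import Callable, Dict, Tuple


class ToySimulator:
    """Hypothetical one-variable facility: the temperature moves toward the setpoint."""

    def __init__(self, temperature: float) -> None:
        self.temperature = temperature

    def measurements(self) -> Dict[str, float]:
        return {"temperature": self.temperature}

    def step(self, control_inputs: Dict[str, float], config: Dict[str, float]) -> Dict[str, float]:
        # `gain` is a configuration parameter of the simulator, supplied at every step.
        self.temperature += config["gain"] * (control_inputs["setpoint"] - self.temperature)
        return self.measurements()


def run_episode(
    simulator: ToySimulator,
    policy: Callable[[Tuple[float, ...]], float],
    config: Dict[str, float],
    steps: int,
) -> Dict[str, float]:
    measurements = simulator.measurements()
    for _ in range(steps):
        observation = (measurements["temperature"],)  # measurements -> observation
        action = policy(observation)  # control policy output
        # action -> control input: clamp to an assumed valid setpoint range.
        control_inputs = {"setpoint": max(0.0, min(30.0, action))}
        measurements = simulator.step(control_inputs, config)  # new state of the facility
    return measurements


final = run_episode(ToySimulator(10.0), policy=lambda obs: 50.0, config={"gain": 0.5}, steps=3)
```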

IPC Classes

G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
G05B 19/418 - Total factory control, i.e. centrally controlling a plurality of machines, e.g. direct or distributed numerical control (DNC), flexible manufacturing systems (FMS), integrated manufacturing systems (IMS), computer integrated manufacturing (CIM)

72. PREDICTING PROTEIN STRUCTURES USING PROTEIN GRAPHS

Application Number	18034989
Status	Pending
Filing Date	2021-11-23
First Publication Date	2023-12-21
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Pritzel, Alexander Figurnov, Mikhail Jumper, John

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for determining a predicted structure of a protein. According to one aspect, there is provided a method comprising maintaining graph data representing a graph of the protein; obtaining a respective pair embedding for each edge in the graph; processing the pair embeddings using a sequence of update blocks, wherein each update block performs operations comprising, for each edge in the graph: generating a respective representation of each of a plurality of cycles in the graph that include the edge by, for each cycle, processing embeddings for edges in the cycle in accordance with the values of the update block parameters of the update block to generate the representation of the cycle; and updating the pair embedding for the edge using the representations of the cycles in the graph that include the edge.

IPC Classes

G16B 15/20 - Protein or domain folding
G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
G16B 30/10 - Sequence alignment; Homology search
G16B 40/20 - Supervised data analysis
G06N 3/08 - Learning methods

73. Simulating Physical Environments with Discontinuous Dynamics Using Graph Neural Networks

Application Number	EP2023066187
Publication Number	2023/242378
Status	In Force
Filing Date	2023-06-15
Publication Date	2023-12-21
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Allen, Kelsey Rebecca Lopez Guevara, Tatiana Pfaff, Tobias Sanchez, Alvaro Rubanova, Yulia Stachenfeld, Kimberly Battaglia, Peter William

Abstract

This specification describes a simulation system that performs simulations of physical environments using a graph neural network. At each of one or more time steps in a sequence of time steps in a given time interval, the system can process a representation of a current state of the physical environment at the current time step using the graph neural network to generate a prediction of a next state of the physical environment at the next time step. Generally, the environment has discontinuous dynamics at one or more time points during the time interval.

IPC Classes

G06F 30/20 - Design optimisation, verification or simulation
G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
G06F 119/12 - Timing analysis or timing optimisation

74. Distributional reinforcement learning for continuous control tasks

Application Number	18303117
Grant Number	11948085
Status	In Force
Filing Date	2023-04-19
First Publication Date	2023-12-21
Grant Date	2024-04-02
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Budden, David Hoffman, Matthew William Barth-Maron, Gabriel

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by a reinforcement learning agent interacting with an environment. In particular, the actions are selected from a continuous action space and the system trains the action selection neural network jointly with a distribution Q network that is used to update the parameters of the action selection neural network.

IPC Classes

G06N 3/08 - Learning methods
G06N 3/045 - Combinations of networks

75. TRAINING CAMERA POLICY NEURAL NETWORKS THROUGH SELF PREDICTION

Application Number	EP2023066186
Publication Number	2023/242377
Status	In Force
Filing Date	2023-06-15
Publication Date	2023-12-21
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Grimes, Matthew Koichi Mirowski, Piotr Wojciech Modayil, Joseph Varughese

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a camera policy neural network.

IPC Classes

G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G06N 3/045 - Combinations of networks
G06N 3/0895 - Weakly supervised learning, e.g. semi-supervised or self-supervised learning
G06N 3/092 - Reinforcement learning

76. PREDICTING PROTEIN STRUCTURES OVER MULTIPLE ITERATIONS USING RECYCLING

Application Number	18034280
Status	Pending
Filing Date	2021-11-23
First Publication Date	2023-12-14
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Jumper, John Figurnov, Mikhail

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for predicting a structure of a protein comprising one or more chains. In one aspect, a method comprises, at each subsequent iteration after a first iteration in a sequence of iterations: obtaining a network input for the subsequent iteration that characterizes the protein; generating, from (i) structure parameters generated at a preceding iteration that precedes the subsequent iteration in the sequence, (ii) one or more intermediate outputs generated by the protein structure prediction neural network while generating the structure parameters at the last iteration, or (iii) both, features for the subsequent iteration; and processing the features and the network input for the subsequent iteration using the protein structure prediction neural network to generate structure parameters for the subsequent iteration that define another predicted structure for the protein.
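
The recycling loop described above has a simple control flow, sketched here in Python. The model and featuriser are toy callables: the "structure" is a single number that moves halfway toward a target on each pass. These stand-ins and all names are assumptions; only the loop structure mirrors the abstract.

```python
from typing import Callable, Dict, Optional, Tuple

Model = Callable[[float, Optional[float]], Tuple[float, Dict[str, float]]]


def predict_with_recycling(
    network_input: float,
    model: Model,
    featurise: Callable[[float, Dict[str, float]], float],
    num_iterations: int,
) -> float:
    """Run the model repeatedly, feeding features from the previous pass back in."""
    # First iteration: nothing to recycle yet.
    structure, intermediates = model(network_input, None)
    for _ in range(num_iterations - 1):
        # Build features from the previous structure and/or intermediate outputs.
        features = featurise(structure, intermediates)
        structure, intermediates = model(network_input, features)
    return structure


def toy_model(target: float, recycled: Optional[float]) -> Tuple[float, Dict[str, float]]:
    """Toy stand-in: refine the previous estimate by moving halfway toward the target."""
    previous = 0.0 if recycled is None else recycled
    refined = previous + 0.5 * (target - previous)
    return refined, {"previous": previous}


toy_featurise = lambda structure, intermediates: structure
prediction = predict_with_recycling(8.0, toy_model, toy_featurise, num_iterations=3)
```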

IPC Classes ?

G16B 40/20 - Supervised data analysis
G16B 15/20 - Protein or domain folding
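
The iteration scheme in the abstract of entry 76 can be illustrated with a minimal sketch. Everything named here is hypothetical: `predict_with_recycling`, the `network` callable, and the scalar `toy_network` stand in for the protein structure prediction neural network and are not taken from the application. The sketch only shows the control flow, in which features for each later iteration are built from the previous iteration's structure parameters and intermediate output.

```python
from typing import Any, Callable, Optional, Tuple

# Hypothetical signature: network(network_input, recycled_features)
# returns (structure_parameters, intermediate_output).
Network = Callable[[Any, Tuple[Optional[Any], Optional[Any]]], Tuple[Any, Any]]


def predict_with_recycling(network: Network, network_input: Any, num_iterations: int) -> Any:
    """Run the network repeatedly, feeding its previous outputs back in."""
    structure, intermediate = None, None  # nothing to recycle at the first iteration
    for _ in range(num_iterations):
        # Features for this iteration come from the preceding iteration's outputs.
        features = (structure, intermediate)
        structure, intermediate = network(network_input, features)
    return structure


def toy_network(x: float, features: Tuple[Optional[float], Optional[float]]) -> Tuple[float, float]:
    """Toy stand-in (an assumption): move the previous estimate halfway toward x."""
    previous = features[0] if features[0] is not None else 0.0
    refined = previous + 0.5 * (x - previous)
    return refined, refined


final_structure = predict_with_recycling(toy_network, 8.0, num_iterations=3)
```

With the toy network the estimate is refined at every pass, which mirrors how each iteration in the abstract defines another predicted structure.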

77. HIERARCHICAL REINFORCEMENT LEARNING AT SCALE

Application Number	EP2023065305
Publication Number	2023/237635
Status	In Force
Filing Date	2023-06-07
Publication Date	2023-12-14
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Soyer, Hubert Josef Behbahani, Feryal Keck, Thomas Albert Nikiforou, Kyriacos Pires, Bernardo Avila Baveja, Satinder Singh

Abstract

The invention describes a system and a method for controlling an agent interacting with an environment to perform a task, the method comprising, at each of a plurality of first time steps from a plurality of time steps: receiving an observation characterizing a state of the environment at the first time step; determining a goal representation for the first time step that characterizes a goal state of the environment to be reached by the agent; processing the observation and the goal representation using a low-level controller neural network to generate a low-level policy output that defines an action to be performed by the agent in response to the observation, wherein the low-level controller neural network comprises: a representation neural network configured to process the observation to generate an internal state representation of the observation, and a low-level policy head configured to process the state observation representation and the goal representation to generate the low-level policy output; and controlling the agent using the low-level policy output.

IPC Classes ?

G06N 3/006 - Artificial life, i.e. computing arrangements simulating life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
G06N 3/045 - Combinations of networks
G06N 3/092 - Reinforcement learning
G06N 7/01 - Probabilistic graphical models, e.g. probabilistic networks
G06N 3/044 - Recurrent networks, e.g. Hopfield networks

78. REINFORCEMENT LEARNING TO EXPLORE ENVIRONMENTS USING META POLICIES

Application Number	EP2023065306
Publication Number	2023/237636
Status	In Force
Filing Date	2023-06-07
Publication Date	2023-12-14
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Zintgraf, Luisa Maria Magalhaes Marinho, Zita Alexandra Kemaev, Iurii Kirsch, Louis Michel Oh, Junhyuk Schaul, Tom

Abstract

The invention describes a method performed by one or more computers for training a base policy neural network that is configured to receive a base policy input comprising an observation of a state of an environment and to process the base policy input to generate a base policy output that defines an action to be performed by an agent in response to the observation, the method comprising: generating training data for training the base policy neural network by controlling an agent using (i) the base policy neural network and (ii) an exploration strategy that maps, in accordance with a set of one or more parameters, base policy outputs generated by the base policy neural network to actions performed by the agent to interact with an environment, the generating comprising, at each of a plurality of time points: determining that criteria for updating the exploration strategy are satisfied at the time point; and in response to determining that the criteria are satisfied: generating a meta policy input that comprises data characterizing a performance of the base policy neural network in controlling the agent at the time point; processing the meta policy input using a meta policy to generate a meta policy output that specifies respective values for each of the set of one or more parameters that define the exploration strategy; and controlling the agent using the base policy neural network and in accordance with the exploration strategy defined by the respective values for the set of one or more parameters specified by the meta policy output.

IPC Classes ?

G06N 3/008 - Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
G06N 3/045 - Combinations of networks
G06N 3/092 - Reinforcement learning
G06N 3/0985 - Hyperparameter optimisation; Meta-learning; Learning-to-learn

79. TRAINING A SPEAKER NEURAL NETWORK USING ONE OR MORE LISTENER NEURAL NETWORKS

Application Number	18199896
Status	Pending
Filing Date	2023-05-19
First Publication Date	2023-12-14
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Singh, Aaditya K. Ding, Fengning Hill, Felix George Lampinen, Andrew Kyle

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a speaker neural network using one or more listener neural networks.

IPC Classes ?

G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/62 - Text, e.g. of license plates, overlay texts or captions on TV images

80. Learning abstractions using patterns of activations of a neural network hidden layer

Application Number	16916939
Grant Number	11842270
Status	In Force
Filing Date	2020-06-30
First Publication Date	2023-12-12
Grant Date	2023-12-12
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Lerchner, Alexander Hassabis, Demis

Abstract

We describe an artificial neural network comprising: an input layer of input neurons, one or more hidden layers of neurons in successive layers of neurons above the input layer, and at least one further, concept-identifying layer of neurons above the hidden layers. The neural network includes an activation memory coupled to an intermediate, hidden layer of neurons between the input and concept-identifying layers to store a pattern of activation of the intermediate layer. The neural network further includes a system to determine an overlap between a plurality of the stored patterns of activation and to activate in the intermediate hidden layer an overlap pattern such that the concept-identifying layer of neurons is configured to identify features of the overlap patterns. We also describe related methods, processor control code, and computing systems for the neural network. Optionally further, higher level concept-identifying layers of neurons may be included.

IPC Classes ?

G06N 3/08 - Learning methods

81. PREDICTING PROTEIN STRUCTURES USING AUXILIARY FOLDING NETWORKS

Application Number	18034006
Status	Pending
Filing Date	2021-11-23
First Publication Date	2023-12-07
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Kohl, Simon Ronneberger, Olaf Figurnov, Mikhail Pritzel, Alexander

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a structure prediction neural network that comprises an embedding neural network and a main folding neural network. According to one aspect, a method comprises: obtaining a training network input characterizing a training protein; processing the training network input using the embedding neural network and the main folding neural network to generate a main structure prediction; for each auxiliary folding neural network in a set of one or more auxiliary folding neural networks, processing at least a corresponding intermediate output of the embedding neural network to generate an auxiliary structure prediction; determining a gradient of an objective function that includes a respective auxiliary structure loss term for each of the auxiliary folding neural networks; and updating the current values of the embedding network parameters and the main folding parameters based on the gradient.

IPC Classes ?

G16B 15/20 - Protein or domain folding
G16B 15/30 - Drug targeting using structural data; Docking or binding prediction
G06N 3/08 - Learning methods
G16B 40/20 - Supervised data analysis

82. SIMULATING PHYSICAL ENVIRONMENTS USING FINE-RESOLUTION AND COARSE-RESOLUTION MESHES

Application Number	EP2023063755
Publication Number	2023/227586
Status	In Force
Filing Date	2023-05-23
Publication Date	2023-11-30
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Fortunato, Meire Pfaff, Tobias Wirnsberger, Peter Pritzel, Alexander Battaglia, Peter William

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for simulating a state of a physical environment. In one aspect, a method performed by one or more computers for simulating the state of the physical environment is provided. The method includes, for each of multiple time steps: obtaining data defining a fine-resolution mesh and a coarse-resolution mesh that each characterize the state of the physical environment at the current time step, where the fine-resolution mesh has a higher resolution than the coarse-resolution mesh; processing data defining the fine-resolution mesh and the coarse-resolution mesh using a graph neural network that includes: (i) one or more fine-resolution update blocks, (ii) one or more coarse-resolution update blocks, and (iii) one or more up-sampling update blocks; and determining the state of the physical environment at a next time step using updated node embeddings for nodes in the fine-resolution mesh.

IPC Classes ?

G06F 30/23 - Design optimisation, verification or simulation using finite element methods [FEM] or finite difference methods [FDM]
G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
G06N 3/02 - Neural networks
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
G06F 111/10 - Numerical modelling
G06F 113/08 - Fluids

83. EXPLORATION BY BOOTSTEPPED PREDICTION

Application Number	EP2023063282
Publication Number	2023/222772
Status	In Force
Filing Date	2023-05-17
Publication Date	2023-11-23
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Guo, Zhaohan Altché, Florent Tallec, Corentin Pires, Bernardo Avila Pîslar, Miruna Thakoor, Shantanu Yogeshraj Azar, Mohammad Gheshlaghi Piot, Bilal

Abstract

An iterative method is proposed to train an action selection system of a reinforcement learning system, based on a reward function which defines a reward value for each action. The reward value includes an intrinsic reward term generated based on the outputs of two encoder models: an online encoder model and a target encoder model. The online encoder model is iteratively trained based on a loss function, and the target encoder model is updated to bring it closer to the online encoder model.

IPC Classes ?

G06N 3/044 - Recurrent networks, e.g. Hopfield networks
G06N 3/045 - Combinations of networks
G06N 3/092 - Reinforcement learning
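
The two-encoder arrangement in the abstract of entry 83 can be sketched as follows. The abstract says only that the intrinsic reward is based on the outputs of the two encoders and that the target encoder is updated to bring it closer to the online encoder. The squared-error reward and the exponential-moving-average update below are assumptions chosen for illustration, and the names `intrinsic_reward`, `move_target_toward_online`, and `rate` are hypothetical.

```python
from typing import List


def intrinsic_reward(online_output: List[float], target_output: List[float]) -> float:
    """Assumed form: squared distance between the two encoders' outputs."""
    return sum((o - t) ** 2 for o, t in zip(online_output, target_output))


def move_target_toward_online(
    target_params: List[float], online_params: List[float], rate: float
) -> List[float]:
    """Assumed update: shift each target parameter a fraction `rate` toward the online one."""
    return [t + rate * (o - t) for t, o in zip(target_params, online_params)]


# Usage with toy parameter vectors.
target = [0.0, 2.0]
online = [1.0, 4.0]
target = move_target_toward_online(target, online, rate=0.5)  # [0.5, 3.0]
bonus = intrinsic_reward([1.0, 2.0], [1.0, 4.0])  # 4.0
```

A small `rate` keeps the target encoder slow-moving, which is one common way to keep such a reward signal from changing too quickly between training iterations.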

84. MACHINE LEARNING SYSTEMS WITH COUNTERFACTUAL INTERVENTIONS

Application Number	EP2023063488
Publication Number	2023/222884
Status	In Force
Filing Date	2023-05-19
Publication Date	2023-11-23
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Rabinowitz, Neil Charles Roy, Nicholas Andrew Kim, Junkyung

Abstract

Systems, methods, and computer programs, for training and using a machine learning system to control an agent to perform a task. The machine learning system is trained using counterfactual internal states so that it can provide an output that explains the behavior of the system in causal terms, e.g. in terms of aspects of its environment that cause the system to select particular actions for the agent.

IPC Classes ?

G06N 3/008 - Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
G06N 3/044 - Recurrent networks, e.g. Hopfield networks
G06N 3/045 - Combinations of networks
G06N 3/084 - Backpropagation, e.g. using gradient descent

85. LARGE-SCALE RETRIEVAL AUGMENTED REINFORCEMENT LEARNING

Application Number	EP2023063492
Publication Number	2023/222885
Status	In Force
Filing Date	2023-05-19
Publication Date	2023-11-23
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Humphreys, Peter Conway Guez, Arthur Clement

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for controlling a reinforcement learning agent in an environment to perform a task. In one aspect, a method comprises: maintaining a retrieval dataset that stores a plurality of history observations and, for each history observation, a respective associated context; receiving a current observation characterizing a current state of the environment; selecting one or more history observations from the plurality of history observations; processing, using an encoder neural network and in accordance with current values of encoder network parameters, an encoder network input comprising (i) the current observation and (ii) the one or more selected history observations and their respective associated context to generate a latent state representation for the current state of the environment; and using the latent state representation to determine an action to be performed by the agent in response to the current observation.

IPC Classes ?

G06F 16/908 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
G06N 3/02 - Neural networks
G06N 20/00 - Machine learning
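
The retrieval step in the abstract of entry 85 can be sketched in a few lines. The abstract does not say how history observations are selected, so nearest-neighbour selection by squared distance is an assumption here, as are the names `select_history` and `build_encoder_input` and the use of a plain string for the associated context.

```python
from typing import Dict, List, Sequence, Tuple

Observation = Sequence[float]
Record = Tuple[Observation, str]  # (history observation, associated context)


def squared_distance(a: Observation, b: Observation) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def select_history(dataset: List[Record], current: Observation, k: int) -> List[Record]:
    """Assumed selection rule: the k stored observations closest to the current one."""
    return sorted(dataset, key=lambda rec: squared_distance(rec[0], current))[:k]


def build_encoder_input(current: Observation, selected: List[Record]) -> Dict[str, object]:
    """Bundle the current observation with the retrieved observations and contexts."""
    return {
        "current_observation": list(current),
        "retrieved": [(list(obs), ctx) for obs, ctx in selected],
    }


retrieval_dataset: List[Record] = [
    ([0.0, 0.0], "a"),
    ([5.0, 5.0], "b"),
    ([1.0, 1.0], "c"),
]
neighbours = select_history(retrieval_dataset, [0.9, 0.9], k=2)
encoder_input = build_encoder_input([0.9, 0.9], neighbours)
```

In the system the abstract describes, the bundled input would then be processed by the encoder neural network to produce the latent state representation. That network is outside the scope of this sketch.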

86. CONTRASTIVE LEARNING USING POSITIVE PSEUDO LABELS

Application Number	EP2023063496
Publication Number	2023/222889
Status	In Force
Filing Date	2023-05-19
Publication Date	2023-11-23
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Mitrovic, Jovana Bosnjak, Matko Richemond, Pierre Tomasev, Nenad Strub, Florian Walker, Jacob Charles Hill, Felix George Buesing, Lars Pascanu, Razvan Blundell, Charles

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training a neural network to perform a machine learning task on one or more received inputs by using a hybrid training dataset with a semi-supervised learning technique. The hybrid training dataset includes multiple unlabeled training inputs and multiple labeled training inputs and, in some cases, more unlabeled training inputs than labeled training inputs.

IPC Classes ?

G06N 3/045 - Combinations of networks
G06N 3/0464 - Convolutional networks [CNN, ConvNet]
G06N 3/084 - Backpropagation, e.g. using gradient descent
G06N 3/0895 - Weakly supervised learning, e.g. semi-supervised or self-supervised learning

87. TRAINING REINFORCEMENT LEARNING AGENTS USING AUGMENTED TEMPORAL DIFFERENCE LEARNING

Application Number	18029979
Status	Pending
Filing Date	2021-10-01
First Publication Date	2023-11-23
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Gulcehre, Caglar Pascanu, Razvan Gomez, Sergio

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network used to select actions performed by an agent interacting with an environment by performing actions that cause the environment to transition states. One of the methods includes maintaining a replay memory storing a plurality of transitions; selecting a plurality of transitions from the replay memory; and training the neural network on the plurality of transitions, comprising, for each transition: generating an initial Q value for the transition; determining a scaled Q value for the transition; determining a scaled temporal difference learning target for the transition; determining an error between the scaled temporal difference learning target and the scaled Q value; determining an update to the current values of the Q network parameters; and determining an update to the current value of the scaling term.

IPC Classes ?

G06N 3/092 - Reinforcement learning
G06N 3/0442 - Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
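
The per-transition quantities listed in the abstract of entry 87 can be illustrated with a small calculation. The abstract does not give the formulas, so this sketch assumes that the scaling term multiplies the Q values and that the target bootstraps from the scaled maximum next-state Q value. The function name `scaled_td_error` and all argument names are hypothetical, and the parameter updates mentioned in the abstract are omitted.

```python
from typing import Sequence


def scaled_td_error(
    q_value: float,
    next_q_values: Sequence[float],
    reward: float,
    discount: float,
    scale: float,
) -> float:
    """Error between a scaled TD target and a scaled Q value (assumed multiplicative scaling)."""
    scaled_q = scale * q_value  # scaled Q value for the transition
    # Scaled temporal difference learning target, bootstrapping from the next state.
    scaled_target = reward + discount * scale * max(next_q_values)
    return scaled_target - scaled_q


# Usage on one toy transition drawn from a replay memory.
error = scaled_td_error(
    q_value=2.0, next_q_values=[1.0, 3.0], reward=1.0, discount=0.5, scale=2.0
)
```

In a full training loop this error would drive both the update to the Q network parameters and the update to the scaling term.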

88. TRAINING MACHINE LEARNING MODELS BY DETERMINING UPDATE RULES USING NEURAL NETWORKS

Application Number	18180754
Status	Pending
Filing Date	2023-03-08
First Publication Date	2023-11-23
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Denil, Misha Man Ray Schaul, Tom Andrychowicz, Marcin Gomes De Freitas, Joao Ferdinando Colmenarejo, Sergio Gomez Hoffman, Matthew William Pfau, David Benjamin

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media for training machine learning models. One method includes obtaining a machine learning model, wherein the machine learning model comprises one or more model parameters, and the machine learning model is trained using gradient descent techniques to optimize an objective function; determining an update rule for the model parameters using a recurrent neural network (RNN); and applying a determined update rule for a final time step in a sequence of multiple time steps to the model parameters.

IPC Classes ?

G06N 3/084 - Backpropagation, e.g. using gradient descent
G06N 3/044 - Recurrent networks, e.g. Hopfield networks
G06N 3/045 - Combinations of networks

89. RESOURCE NAVIGATION USING NEURAL NETWORKS

Application Number	EP2023063486
Publication Number	2023/222882
Status	In Force
Filing Date	2023-05-19
Publication Date	2023-11-23
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Marino, Kenneth Daniel Zaheer, Manzil Fergus, Robert David Grathwohl, Will S.

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for resource navigation using neural networks.

IPC Classes ?

G06F 16/33 - Querying
G06N 3/02 - Neural networks
G06F 16/953 - Querying, e.g. by the use of web search engines

90. DETERMINING GENERALIZED EIGENVECTORS USING MULTI-AGENT INTERACTIONS

Application Number	EP2023063487
Publication Number	2023/222883
Status	In Force
Filing Date	2023-05-19
Publication Date	2023-11-23
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Gemp, Ian Michael Mcwilliams, Brian Chen, Charlie Xiangyu

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for determining generalized eigenvectors that characterize a data set.

IPC Classes ?

G06F 17/16 - Matrix or vector computation

91. INTRA-AGENT SPEECH TO FACILITATE TASK LEARNING

Application Number	EP2023063494
Publication Number	2023/222887
Status	In Force
Filing Date	2023-05-19
Publication Date	2023-11-23
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Yan, Chen Carnevale, Federico Javier Georgiev, Petko Ivanov Santoro, Adam Anthony Guy, Aurelia Adrianna Muldal, Alistair Michael Hung, Chia-Chun Abramson, Joshua Simon Lillicrap, Timothy Paul Wayne, Gregory Duncan

Abstract

Systems, methods, and computer programs for learning to control an embodied agent to perform tasks. The techniques use internal, "intra-agent" speech when learning, and are thus able to perform tasks involving new objects without any direct experience of interacting with those objects, i.e. zero-shot. Implementations of the techniques use an image captioning neural network system to generate natural language captions used when training an action selection neural network system.

IPC Classes ?

G06N 3/008 - Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
G06N 3/0455 - Auto-encoder networks; Encoder-decoder networks
G06N 3/044 - Recurrent networks, e.g. Hopfield networks
G06N 5/045 - Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
G06N 3/096 - Transfer learning

92. SELECTION-INFERENCE NEURAL NETWORK SYSTEMS

Application Number	18317878
Status	Pending
Filing Date	2023-05-15
First Publication Date	2023-11-16
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Creswell, Antonia Phoebe Nina

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a response to a query input using a selection-inference neural network.

IPC Classes ?

B60W 50/06 - Improving the dynamic response of the control system, e.g. improving the speed of regulation or avoiding hunting or overshoot
G05B 13/02 - Adaptive control systems, i.e. systems automatically adjusting themselves to have a performance which is optimum according to some preassigned criterion electric
G06F 40/20 - Natural language analysis
B60W 60/00 - Drive control systems specially adapted for autonomous road vehicles
B60W 50/02 - Ensuring safety in case of control system failures, e.g. by diagnosing, circumventing or fixing failures

93. VARIABLE RESOLUTION VARIABLE FRAME RATE VIDEO CODING USING NEURAL NETWORKS

Application Number	EP2023062431
Publication Number	2023/217867
Status	In Force
Filing Date	2023-05-10
Publication Date	2023-11-16
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Assael, Ioannis Alexandros Shillingford, Brendan

Abstract

Systems and methods for encoding video, and for decoding video at an arbitrary temporal and/or spatial resolution. The techniques use a scene representation neural network that, in implementations, is configured to represent frames of a 2D or 3D video as a 3D model encoded in the parameters of the neural network.

IPC Classes ?

G06N 3/04 - Architecture, e.g. interconnection topology
H04N 19/31 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the temporal domain
H04N 19/33 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability in the spatial domain

94. NEGOTIATING CONTRACTS FOR AGENT COOPERATION IN MULTI-AGENT SYSTEMS

Application Number	EP2023062432
Publication Number	2023/217868
Status	In Force
Filing Date	2023-05-10
Publication Date	2023-11-16
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Bachrach, Yoram Tacchetti, Andrea Gemp, Ian Michael Kramár, János Malinowski, Mateusz Mckee, Kevin Robert

Abstract

Methods, systems and apparatus, including computer programs encoded on computer storage media, for enabling agents to cooperate with one another in a way that improves their collective efficiency. The agents can modify their behavior by taking into account the behavior of other agents, so that a better overall result can be achieved than if each agent acted independently. This is done by enabling the agents to negotiate contracts with one another that restrict their respective actions.

IPC Classes ?

G06Q 10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
G06N 3/02 - Neural networks

95. CONSTRAINED REINFORCEMENT LEARNING NEURAL NETWORK SYSTEMS USING PARETO FRONT OPTIMIZATION

Application Number	18029992
Status	Pending
Filing Date	2021-10-01
First Publication Date	2023-11-16
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Huang, Sandy Han Abdolmaleki, Abbas

Abstract

A system and method that control an agent to perform a task subject to one or more constraints. The system trains a preference neural network that learns which preferences produce constraint-satisfying action selection policies. The system thereby optimizes a hierarchical policy that is a product of a preference policy and a preference-conditioned action selection policy. It learns to jointly optimize a set of objectives relating to rewards and costs received during the task whilst also learning preferences, i.e. trade-offs between the rewards and costs, that are most likely to produce policies that satisfy the constraints.

IPC Classes ?

G06N 3/092 - Reinforcement learning

96. SELECTION-INFERENCE NEURAL NETWORK SYSTEMS

Application Number	EP2023062781
Publication Number	2023/218040
Status	In Force
Filing Date	2023-05-12
Publication Date	2023-11-16
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Creswell, Antonia Phoebe Nina

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for generating a response to a query input using a selection-inference neural network.

IPC Classes ?

G06N 3/045 - Combinations of networks
G06F 40/20 - Natural language analysis
G06N 5/02 - Knowledge representation; Symbolic representation
G06N 3/042 - Knowledge-based neural networks; Logical representations of neural networks
G06F 40/35 - Discourse or dialogue representation
G06N 5/045 - Explanation of inference; Explainable artificial intelligence [XAI]; Interpretable artificial intelligence
G06N 3/096 - Transfer learning
G06N 5/046 - Forward inferencing; Production systems

97. SIMULATING PHYSICAL ENVIRONMENTS USING GRAPH NEURAL NETWORKS

Application Number	18027174
Status	Pending
Filing Date	2021-10-01
First Publication Date	2023-11-09
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Sanchez, Alvaro Godwin, Jonathan William Ying, Rex Pfaff, Tobias Fortunato, Meire Battaglia, Peter William

Abstract

This specification describes a simulation system that performs simulations of physical environments using a graph neural network. At each of one or more time steps in a sequence of time steps, the system can process a representation of a current state of the physical environment at the current time step using the graph neural network to generate a prediction of a next state of the physical environment at the next time step. Some implementations of the system are adapted for hardware acceleration. As well as performing simulations, the system can be used to predict physical quantities based on measured real-world data. Implementations of the system are differentiable and can also be used for design optimization, and for optimal control tasks.

IPC Classes ?

G06F 30/27 - Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model

98. DATA COMPRESSION AND RECONSTRUCTION USING SPARSE META-LEARNED NEURAL NETWORKS

Application Number	EP2023061711
Publication Number	2023/213903
Status	In Force
Filing Date	2023-05-03
Publication Date	2023-11-09
Owner	DEEPMIND TECHNOLOGIES LIMITED (United Kingdom)
Inventor	Schwarz, Jonathan

Abstract

Methods, systems, and apparatus, including computer programs encoded on computer storage media, for compressing and decompressing data signals using sparse, meta-learned neural networks.

IPC Classes ?

G06N 3/045 - Combinations of networks
G06N 3/0495 - Quantised networks; Sparse networks; Compressed networks
G06N 3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
G06N 3/084 - Backpropagation, e.g. using gradient descent
G06N 3/0985 - Hyperparameter optimisation; Meta-learning; Learning-to-learn

99. TRAINING PROTEIN STRUCTURE PREDICTION NEURAL NETWORKS USING REDUCED MULTIPLE SEQUENCE ALIGNMENTS

Application Number	18025689
Status	Pending
Filing Date	2021-08-12
First Publication Date	2023-11-09
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Evans, Richard Andrew Jumper, John Green, Timothy Frederick Goldie Reiman, David

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training neural networks to predict the structure of a protein. In one aspect, a method comprises: obtaining, for each of a plurality of proteins, a full multiple sequence alignment for the protein; generating, for each of the plurality of proteins, target structure parameters characterizing a structure of the protein from the full multiple sequence alignment for the protein, comprising processing a representation of the full multiple sequence alignment for the protein using the structure prediction neural network to generate output structure parameters characterizing a structure of the protein, and determining the target structure parameters for the protein based on the output structure parameters for the protein; determining, for each of the plurality of proteins, a reduced multiple sequence alignment for the protein, comprising removing or masking data from the full multiple sequence alignment for the protein.

IPC Classes ?

G16B 40/20 - Supervised data analysis
G16B 15/20 - Protein or domain folding
G16B 30/10 - Sequence alignment; Homology search
G06N 3/045 - Combinations of networks

100. LANGUAGE MODEL FOR PROCESSING A MULTI-MODE QUERY INPUT

Application Number	18141337
Status	Pending
Filing Date	2023-04-28
First Publication Date	2023-11-02
Owner	DeepMind Technologies Limited (United Kingdom)
Inventor	Alayrac, Jean-Baptiste Donahue, Jeffrey Lenc, Karel Simonyan, Karen Reynolds, Malcolm Kevin Campbell Luc, Pauline Mensch, Arthur Barr, Iain Miech, Antoine Hasson, Yana Elizabeth Millican, Katherine Elizabeth Ring, Roman

Abstract

A query processing system is described which receives a query input comprising an input token string and also at least one data item having a second, different modality, and generates a corresponding output token string.

IPC Classes ?

G06F 16/432 - Query formulation
G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
G06F 16/438 - Presentation of query results
