Various arrangements for performing wireless device-to-device communication are presented. An audio output device, such as an earbud or pair of earbuds, can establish a connection with an audio source via a first Bluetooth interface that communicates using a Bluetooth communication protocol on a 2.4 GHz Bluetooth frequency band. The audio output device can negotiate that Bluetooth frequency-shifted communication, such as on a 5 or 6 GHz frequency band, is available for use with the audio source. The audio output device may then perform Bluetooth frequency-shifted communication with the audio source such that the audio output device receives an audio stream from the audio source using Bluetooth frequency-shifted communication and the Bluetooth communication protocol.
Implementations relate to mitigating latency in guiding a user, during an interaction between the user and a computing system, in selecting a subset of item(s) from a superset of candidate items, and to causing performance of further action(s) based on the selected subset of item(s). In guiding a user in selecting the subset of items, various implementations enable the user to provide only spoken input(s) in selecting the subset of item(s), and provide visual output(s) that are responsive to the spoken input(s) and that guide the user in selecting the item(s). In some of those various implementations, there is not any (or there is only de minimis) audible synthesized spoken output rendered by the computing system in guiding the user in selecting the subset of item(s).
Systems and methods for textual replacement can include the determination of a visual intent, which can trigger an interface for selecting an image to replace visual descriptors. The visually descriptive terms can be identified, and an indicator can be provided to indicate the text replacement option may be initiated. An image can then be selected by a user to replace the visually descriptive terms.
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for language models using domain-specific model components. In some implementations, context data for an utterance is obtained. A domain-specific model component is selected from among multiple domain-specific model components of a language model based on the non-linguistic context of the utterance. A score for a candidate transcription for the utterance is generated using the selected domain-specific model component and a baseline model component of the language model that is domain-independent. A transcription for the utterance is determined using the score, and the transcription is provided as output of an automated speech recognition system.
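As a rough illustration of the scoring step described in this abstract, the sketch below selects a domain-specific component from non-linguistic context and combines its score with a domain-independent baseline score. The component names, the context key `app`, the log-probability tables, and the log-space interpolation weight are all hypothetical; the abstract does not say how the two scores are combined.

```python
import math

# Hypothetical log-probability tables standing in for trained model components.
BASELINE = {"call mom": math.log(0.02), "cull mom": math.log(0.001)}
DOMAIN_COMPONENTS = {
    "dialer": {"call mom": math.log(0.30), "cull mom": math.log(0.0005)},
    "maps": {"call mom": math.log(0.01), "cull mom": math.log(0.001)},
}


def select_component(context):
    """Pick a domain-specific component from non-linguistic context (assumed: the active app)."""
    return DOMAIN_COMPONENTS.get(context.get("app"), {})


def score(candidate, context, weight=0.5, floor=math.log(1e-6)):
    """Interpolate baseline and domain scores in log space (an assumed combination rule)."""
    domain = select_component(context)
    baseline_score = BASELINE.get(candidate, floor)
    domain_score = domain.get(candidate, floor)
    return (1 - weight) * baseline_score + weight * domain_score


def transcribe(candidates, context):
    """Return the highest-scoring candidate transcription."""
    return max(candidates, key=lambda c: score(c, context))
```

With these made-up tables, `transcribe(["call mom", "cull mom"], {"app": "dialer"})` selects "call mom", and the same candidate scores lower when the context is the maps application.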
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for parameterization of physical dimensions of discrete circuit components for component definitions that define discrete circuit components. The component definitions may be selected for use in a device design. When a parameterization of a particular version of a discrete circuit component definition is changed, the version level of the device design is also changed and the circuit layout for the device design is physically verified for the new version level.
Implementations relate to facilitating continued conversations of a user with an automated assistant when the user changes locations relative to one or more devices in an ecosystem of linked assistant devices. The user initially invokes a first device and provides a request, which is processed by the first device. The first device provides a notification to one or more other devices in the ecosystem to indicate that the user is likely to issue a further assistant request. The first device processes subsequent audio data to determine whether the subsequent audio data includes a further assistant request. The one or more other notified devices process device-specific sensor data to determine whether the user is co-present with one of the other devices. If user presence is detected, an indication is provided to the first device, causing the first device to cease processing subsequent audio data. Further, the co-present device starts to process subsequent audio data.
The subject matter described in this disclosure includes a pixel circuit with an LED and a driving transistor having a drain terminal that is connected to the LED to supply power to the LED. The pixel circuit also includes a second transistor that is connected between the LED and an initialization voltage line, the second transistor having a gate terminal connected to a scan line. The pixel circuit also includes a third transistor that is connected between the LED and the initialization voltage line in series with the second transistor, the third transistor having a gate terminal connected to a reset line. The pixel circuit is configured so that activating the scan line at a first frequency and activating the reset line at half the first frequency causes the LED to be initialized every other time the scan line is activated.
G09G 3/3233 - Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix using controlled light sources using electroluminescent panels semiconductive, e.g. using light-emitting diodes [LED] organic, e.g. using organic light-emitting diodes [OLED] using an active matrix with pixel circuitry controlling the current through the light-emitting element
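The timing relationship in the pixel circuit abstract can be sketched with a small logical model. It assumes the reset line is active on every second scan pulse (half the scan frequency) and that the LED is connected to the initialization voltage line only when both series transistors conduct. The function name and the choice of which pulses carry the reset are illustrative assumptions, not details from the disclosure.

```python
def initialization_events(num_scan_pulses):
    """Return, per scan pulse, whether the LED is connected to the initialization line.

    Assumption: the two series transistors conduct only when both the scan line
    and the reset line are active, and the reset line fires on even-numbered pulses.
    """
    events = []
    for pulse in range(num_scan_pulses):
        scan_active = True                # scan line fires on every pulse
        reset_active = (pulse % 2 == 0)   # reset line fires at half the scan frequency
        events.append(scan_active and reset_active)
    return events
```

For four scan pulses this model yields initialization on the first and third pulses only, matching the "every other time" behavior the abstract describes.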
8.
PARAMETERIZED NOISE SYNTHESIS FOR GRAPHICAL ARTIFACT REMOVAL
Pre-encoding noise parameterization techniques mitigate or eliminate banding and other graphical artifacts in video frames for decoding and presentation by a client device. For one or more input video frames, a quantization parameter associated with the input video frames is identified. Noise synthesis parameters are determined based on the identified quantization parameter, and the input video frames are encoded for transmission. The encoded video frames are transmitted to the client device along with the determined noise synthesis parameters, for use by the client device in generating synthetic noise to add to resulting video frames decoded by the client device.
H04N 19/70 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
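A minimal sketch of the pre-encoding flow described above, assuming a simple linear mapping from quantization parameter to noise strength. The mapping, the `max_qp` and `max_strength` defaults, and the dictionary layout are hypothetical; the abstract only states that noise synthesis parameters are determined from the quantization parameter and sent along with the encoded frames.

```python
def noise_params_for_qp(qp, max_qp=63, max_strength=8.0):
    """Map a quantization parameter to a synthetic-noise strength (hypothetical linear rule)."""
    qp = min(max(qp, 0), max_qp)  # clamp to the assumed valid range
    return {"strength": max_strength * qp / max_qp}


def package_for_client(encoded_frames, qp):
    """Bundle already-encoded frames with the noise parameters the client will use."""
    return {"frames": encoded_frames, "noise": noise_params_for_qp(qp)}
```

The client would then generate synthetic noise from the `noise` entry and add it to its decoded frames; that step is omitted here.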
9.
DETERMINING ATTRIBUTES FOR ELEMENTS OF DISPLAYABLE CONTENT AND ADDING THEM TO AN ACCESSIBILITY TREE
A method may receive an image representing displayable content for display by an application. The method may execute a layout extraction model that uses the image as input and generates a list of elements for the image as output, the list of elements including at least a bounding box defining a portion of the image and a role attribute. The method may add the role attribute to a node in an accessibility tree using the list of elements.
Techniques are described herein for providing people suggestions in collaborative online text editors. A method includes: receiving user interface input that corresponds to a document in a document editing application; automatically parsing the received user interface input to identify a name included in the user interface input; in response to identifying the name included in the user interface input, providing an option to create a link in the document between the name and a corresponding contact in a contact store; receiving additional user interface input that indicates acceptance of the option to create the link in the document; and in response to receiving the additional user interface input, automatically creating the link in the document between the name and the corresponding contact in the contact store.
The wait time prediction technology determines expected wait times for businesses or other public services using a model generated based on at least historical wait times for the business. In response to a request from a user, an expected wait time for service at the business for at least one particular time period on a particular day of a week is determined using the model and provided for display. User feedback regarding the expected wait time may be requested, and used to refresh the model as new wait times and other information are collected.
G06Q 10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
G06Q 10/109 - Time management, e.g. calendars, reminders, meetings or time accounting
G06Q 30/0201 - Market modelling; Market analysis; Collecting market data
G06Q 30/0202 - Market predictions or forecasting for commercial activities
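One simple way to picture a model built from historical wait times is a per-period average, sketched below. The class name, the (weekday, hour) bucketing, and the use of a plain mean are assumptions for illustration; the abstract does not specify the model form.

```python
from collections import defaultdict


class WaitTimeModel:
    """Toy model: mean of historical waits per (weekday, hour) bucket."""

    def __init__(self):
        self._waits = defaultdict(list)

    def add_observation(self, weekday, hour, minutes):
        """Record a historical wait time or a new wait time derived from user feedback."""
        self._waits[(weekday, hour)].append(minutes)

    def expected_wait(self, weekday, hour):
        """Return the mean wait for the period, or None when there is no history."""
        samples = self._waits.get((weekday, hour))
        if not samples:
            return None
        return sum(samples) / len(samples)
```

Adding a feedback-derived observation with `add_observation` shifts the estimate for that period, which loosely mirrors refreshing the model as new wait times are collected.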
12.
AUTOMATED ASSISTANT THAT UTILIZES RADAR DATA TO DETERMINE USER PRESENCE AND VIRTUALLY SEGMENT AN ENVIRONMENT
Implementations relate to an automated assistant that can determine whether to respond to inputs in an environment according to whether radar data indicates a user is present. When user presence is detected, the automated assistant can virtually segment the environment and apply certain operational parameters to certain segments of the environment. For instance, the automated assistant can enable an input detection feature, such as warm word detection, for a segmented portion of the environment in which a user is detected. In this way, false positives can be mitigated for instances in which environmental and/or user sounds are detected by the automated assistant but do not originate from a particular segment of the environment. Other parameters, such as varying confidence thresholds and/or speech processing biasing, can be temporarily enforced for different segments of an environment in which a user is detected.
The present disclosure is directed to selective gesture recognition for handheld device gestures. An example method includes receiving, by a handheld interactive object, movement information descriptive of a gesture performed with the handheld interactive object. The method includes selecting a local and/or remote machine-learned model for processing the movement information. The movement information can be processed to identify a gesture action corresponding to the movement information. The local and/or remote machine-learned model can be selected based on user input data and/or a complexity of the movement information. In response to selecting the local machine-learned model, the method includes processing the movement information according to the local machine-learned model and communicating a message to a remote device based on the result. In response to selecting the remote machine-learned model, the method includes communicating the movement information to the remote device for processing in accordance with the remote machine-learned model.
G06F 3/038 - Control and interface arrangements therefor, e.g. drivers or device-embedded control circuitry
G06F 3/01 - Input arrangements or combined input and output arrangements for interaction between user and computer
G06F 3/0346 - Pointing devices displaced or positioned by the user; Accessories therefor with detection of the device orientation or free movement in a 3D space, e.g. 3D mice, 6-DOF [six degrees of freedom] pointers using gyroscopes, accelerometers or tilt-sensors
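The local-versus-remote selection in the gesture recognition abstract might look roughly like the following. The complexity proxy, the threshold, the `user_prefers_local` flag, and the message shapes are all hypothetical; the abstract only says the choice can depend on user input data and/or the complexity of the movement information.

```python
def movement_complexity(samples):
    """Assumed complexity proxy: total absolute change between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))


def select_model(samples, user_prefers_local=False, threshold=10.0):
    """Choose 'local' for simple movements or on user request, otherwise 'remote'."""
    if user_prefers_local or movement_complexity(samples) <= threshold:
        return "local"
    return "remote"


def handle_gesture(samples, local_model, send_to_remote, **kwargs):
    """Process locally and send the result, or forward the raw movement information."""
    if select_model(samples, **kwargs) == "local":
        action = local_model(samples)
        send_to_remote({"type": "gesture_action", "action": action})
    else:
        send_to_remote({"type": "movement_info", "samples": samples})
```

Here `local_model` and `send_to_remote` are caller-supplied callables, so the sketch can be exercised without a real model or radio link.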
14.
Automatic Audio Playback of Displayed Textual Content
An audio playback system that provides intuitive audio playback of textual content responsive to user input actions, such as scrolling portions of textual content on a display. Playback of audio (e.g., text-to-speech audio) that includes textual content can begin based on a portion of textual content being positioned by a user input at a certain position on a device display. As one example, a user can simply scroll through a webpage or other content item to cause a text-to-speech system to perform audio playback of textual content displayed in one or more playback section(s) of the device's viewport (e.g., rather than requiring the user to perform additional tapping or gesturing to specifically select a certain portion of textual content).
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
This document describes techniques and systems for providing trusted computing for digital devices. The techniques and systems may use cryptographic algorithms to provide trusted computing and processing. By doing so, the techniques help ensure authentic computation and prevent nefarious acts. For example, a method is described that receives a signature associated with a designee and validates the signature. The signature may be associated with a designee of a host computing device, and the signature may be generated according to firmware associated with an integrated circuit of the host computing device and a first private key of a first asymmetric key pair. Signature validation may be based on a second asymmetric key pair having a second private key and a second public key, the second private key stored in write-once memory of the host computing device.
G06F 21/57 - Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
A waveguide includes an outcoupler with a dual reflective facet configuration. The dual reflective facet configuration includes a first set of reflective facets to receive light from a first direction and reflect the light incident thereon to an outcoupling direction. The dual reflective facet configuration also includes a second set of reflective facets to receive light from a second direction and reflect the light incident thereon to the outcoupling direction.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating speech from text. One of the systems includes one or more computers and one or more storage devices storing instructions that when executed by one or more computers cause the one or more computers to implement: a sequence-to-sequence recurrent neural network configured to: receive a sequence of characters in a particular natural language, and process the sequence of characters to generate a spectrogram of a verbal utterance of the sequence of characters in the particular natural language; and a subsystem configured to: receive the sequence of characters in the particular natural language, and provide the sequence of characters as input to the sequence-to-sequence recurrent neural network to obtain as output the spectrogram of the verbal utterance of the sequence of characters in the particular natural language.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G06N 3/084 - Backpropagation, e.g. using gradient descent
G10L 13/04 - Methods for producing synthetic speech; Speech synthesisers - Details of speech synthesis systems, e.g. synthesiser structure or memory management
G10L 15/16 - Speech classification or search using artificial neural networks
G10L 25/18 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
18.
MAGIC STATE FACTORY CONSTRUCTIONS FOR PRODUCING CCZ AND T STATES
Methods, systems, and apparatus for producing CCZ states and T states. In one aspect, a method for transforming a CCZ state into three T states includes obtaining a first target qubit, a second target qubit and a third target qubit in a CCZ state; performing an X^{-1/2} gate on the third target qubit; performing an X gate on the first target qubit and the second target qubit using the third target qubit as a control; performing a Z gate on the first target qubit and the second target qubit using the third qubit as an X-axis control; performing a Z^{-1/4} gate on the third target qubit; and performing a Z gate on the first target qubit and the second target qubit using the third qubit as an X-axis control to obtain the three T states.
G06N 10/20 - Models of quantum computing, e.g. quantum circuits or universal quantum computers
G06N 10/40 - Physical realisations or architectures of quantum processors or components for manipulating qubits, e.g. qubit coupling or qubit control
H03K 19/003 - Modifications for increasing the reliability
H03M 13/00 - Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
H03M 13/03 - Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
H03M 13/29 - Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes combining two or more codes or code structures, e.g. product codes, generalised product codes, concatenated codes, inner and outer codes
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training a neural network using a priority queue. One of the methods includes maintaining data identifying a set of K output sequences that were previously generated; selecting at least one of the output sequences from the set of output sequences; for each selected output sequence, determining a respective score; determining, for each selected sequence, a respective first update to the current values of the controller parameters; generating a batch of new output sequences using the controller neural network; obtaining a respective reward for each of the new output sequences; determining, from the new output sequences and the output sequences in the maintained data, the K output sequences that have the highest rewards; and modifying the maintained data.
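The step of keeping the K output sequences with the highest rewards can be sketched with the standard library's `heapq.nlargest`. Representing each entry as a `(reward, sequence)` tuple is an assumption made for this example; the controller network and its parameter updates are outside its scope.

```python
import heapq


def update_top_k(maintained, new_sequences, k):
    """Keep the K (reward, sequence) pairs with the highest rewards.

    Both inputs are lists of (reward, sequence) tuples; the result is sorted
    from highest to lowest reward.
    """
    return heapq.nlargest(k, maintained + new_sequences, key=lambda pair: pair[0])
```

For example, merging `[(0.5, "a"), (0.9, "b")]` with a new batch `[(0.7, "c"), (0.1, "d")]` and K = 2 keeps the entries for "b" and "c".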
Aspects of the disclosure relate to providing a secure collaboration between one or more PCIe accelerators and an enclave. An example system may include a PCIe accelerator apparatus. The PCIe accelerator apparatus may include the one or more PCIe accelerators and a microcontroller configured to provide a cryptographic identity to the PCIe accelerator apparatus. The PCIe accelerator apparatus may be configured to use the cryptographic identity to establish communication between the PCIe accelerator apparatus and the enclave.
G06F 21/72 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure computing or processing of information in cryptographic circuits
G06F 13/42 - Bus transfer protocol, e.g. handshake; Synchronisation
G06F 21/79 - Protecting specific internal or peripheral components, in which the protection of a component leads to protection of the entire computer to assure secure storage of data in semiconductor storage media, e.g. directly-addressable memories
21.
AUTOMATED ASSISTANT THAT UTILIZES RADAR DATA TO DETERMINE USER PRESENCE AND VIRTUALLY SEGMENT AN ENVIRONMENT
Implementations relate to an automated assistant that can determine whether to respond to inputs in an environment according to whether radar data indicates a user is present. When user presence is detected, the automated assistant can virtually segment the environment and apply certain operational parameters to certain segments of the environment. For instance, the automated assistant can enable an input detection feature, such as warm word detection, for a segmented portion of the environment in which a user is detected. In this way, false positives can be mitigated for instances in which environmental and/or user sounds are detected by the automated assistant but do not originate from a particular segment of the environment. Other parameters, such as varying confidence thresholds and/or speech processing biasing, can be temporarily enforced for different segments of an environment in which a user is detected.
A UE (102) receives (305), from a network entity (106a), a configuration for a TCI state list for a first serving cell that corresponds to a reference cell. The UE (102) further receives (309), from the network entity (106a), a first parameter and a second parameter that define an other TCI state list for a second serving cell. The first parameter indicates a type of the other TCI state list for the second serving cell and the second parameter indicates at least one of: a serving cell index for the first serving cell or one or more TCI state IDs in the TCI state list. The UE (102) communicates (377) with the network entity (106a) using the other TCI state list for the second serving cell. The other TCI state list is based on the second parameter and the TCI state list for the first serving cell.
A computing device may drive a haptic device of the computing device to output a precursor haptic signal. The computing device may determine a motion signal associated with outputting the precursor haptic signal. The computing device may determine, based at least in part on the motion signal associated with outputting the precursor haptic signal, that the computing device is in an adverse haptic environment. The computing device may, in response to determining that the computing device is in an adverse haptic environment, drive, by the one or more processors, the haptic device to output an alternative haptic signal instead of the haptic signal.
An enhanced matrix product state-based decoder is generated and employed to almost optimally detect and correct errors within a quantum computing and information processing system. The decoder takes as input a detector level error model that describes physical error channels and a set of error detections. This error model is improved using experimental data.
25.
DUAL BAND WIRELESS COMMUNICATIONS FOR MULTIPLE CONCURRENT AUDIO STREAMS
Various arrangements for performing wireless device-to-device communication are presented. An audio output device, such as an earbud or pair of earbuds, can establish a connection with an audio source via a first Bluetooth interface that communicates using a Bluetooth communication protocol on a 2.4 GHz Bluetooth frequency band. The audio output device can negotiate that Bluetooth frequency-shifted communication, such as on a 5 or 6 GHz frequency band, is available for use with the audio source. The audio output device may then perform Bluetooth frequency-shifted communication with the audio source such that the audio output device receives an audio stream from the audio source using Bluetooth frequency-shifted communication and the Bluetooth communication protocol.
H04W 4/80 - Services using short range communication, e.g. near-field communication [NFC], radio-frequency identification [RFID] or low energy communication
Filter coefficient derivation simplification for cross-component prediction reduces latencies typically introduced by convolutional cross-component model (CCCM) prediction and thus enables use of CCCM prediction by hardware coders. Various approaches for filter coefficient derivation simplification are disclosed, including limiting a dynamic range of filter coefficient derivation to a defined bit range, limiting filter coefficient derivation and thus use of CCCM prediction based on coding unit size, and/or enabling filter coefficient derivation directly from non-downsampled luma samples.
H04N 19/105 - Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
H04N 19/157 - Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/186 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
H04N 19/593 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
H04N 19/70 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
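Two of the simplifications named in the filter coefficient abstract, limiting dynamic range to a defined bit range and gating CCCM prediction on coding unit size, can be illustrated as follows. The 16-bit default, the area threshold, and the direction of the size test (disabling CCCM for small units) are assumptions made for this sketch; the abstract does not give specific values.

```python
def clamp_to_bit_range(value, bits=16):
    """Clamp a signed integer to the two's-complement range of `bits` bits."""
    low = -(1 << (bits - 1))
    high = (1 << (bits - 1)) - 1
    return max(low, min(high, value))


def cccm_enabled(cu_width, cu_height, min_area=64):
    """Gate CCCM prediction on coding unit size (hypothetical area threshold)."""
    return cu_width * cu_height >= min_area
```

Applying `clamp_to_bit_range` to intermediate values keeps them within a fixed-width register, which is the kind of bound a hardware coder would rely on.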
Region-based cross-component prediction improves convolutional cross-component mode (CCCM) prediction by enabling filter coefficients for predicting chroma samples from luma samples to be derived for an entire region of a frame of a video stream, such as a coding tree unit (CTU), rather than requiring that such filter coefficients be derived for each individual coding unit (CU). Deriving the filter coefficients for an entire region instead of for each individual CU under processing significantly reduces the latency in video coding and thus enables CCCM prediction to be used in hardware coder implementations.
H04N 19/105 - Selection of the reference unit for prediction within a chosen coding or prediction mode, e.g. adaptive choice of position and number of pixels used for prediction
H04N 19/157 - Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/186 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
H04N 19/593 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using predictive coding involving spatial prediction techniques
H04N 19/70 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
28.
TRANSLATION AND SCALING EQUIVARIANT SLOT ATTENTION
A method includes receiving feature vectors and, for each respective feature vector, a corresponding absolute positional encoding. The method also includes determining latent representations of entities represented by the feature vectors, and determining, for each respective latent representation, a corresponding relative positional encoding based on the corresponding absolute positional encoding of each feature vector and a corresponding position vector associated with the respective latent representation. The method additionally includes determining an attention matrix based on the feature vectors, the entity-centric latent representations, and the corresponding relative positional encoding of each latent representation. The method further includes updating, for each respective latent representation, the corresponding position vector based on a weighted mean of the corresponding absolute positional encoding of each feature vector weighted according to corresponding entries of the attention matrix, and outputting the latent representations and/or the position vectors associated therewith.
G06N 3/0442 - Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
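The position-related steps in the slot attention abstract can be sketched in plain Python: each latent's position vector is updated to the attention-weighted mean of the features' absolute positions, and a relative encoding is formed by subtracting that position. This covers translation only; scale handling, the learned projections, and the attention computation itself are omitted, and the function names are hypothetical.

```python
def update_positions(attention, abs_positions):
    """Attention-weighted mean of absolute positions, one (x, y) per latent.

    attention[i][j] is the weight of feature j for latent i;
    abs_positions[j] is the (x, y) absolute position of feature j.
    """
    updated = []
    for weights in attention:
        total = sum(weights)
        x = sum(w * p[0] for w, p in zip(weights, abs_positions)) / total
        y = sum(w * p[1] for w, p in zip(weights, abs_positions)) / total
        updated.append((x, y))
    return updated


def relative_encoding(abs_positions, latent_position):
    """Express each feature position relative to one latent's position vector."""
    lx, ly = latent_position
    return [(px - lx, py - ly) for px, py in abs_positions]
```

Because the relative encoding depends only on differences between positions, shifting every input position by the same offset leaves it unchanged once the position vectors have been updated.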
A waveguide includes a plurality of reflective facet sets. Each reflective facet set of the plurality of reflective facet sets includes a first reflective facet to reflect light having a first optical characteristic and a second reflective facet to reflect light having a second optical characteristic that is different from the first optical characteristic. A first reflective facet in a first reflective facet set of the plurality of reflective facet sets overlaps a first reflective facet of a second set of the plurality of reflective facet sets.
A method (400) for an aggregatable application programming interface (API) includes receiving, from a third party service (150), an aggregation request (20) requesting aggregation of client data (30) from a client (12) of the third party service. The method also includes receiving, from an API (14) executed by a client device (10) of the client, a first portion of the client data (30a). The method includes storing the first portion of the client data and receiving, from the API, a second portion of the client data (30b). The method includes determining that the second portion of the client data is a final portion of the client data. In response, the method includes aggregating the first portion of the client data with the second portion of the client data. The method also includes transmitting the aggregated client data (30A) to the third party service.
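The store-then-aggregate flow in this abstract can be sketched as a small class. Modeling each portion as a list, signaling the final portion with an `is_final` flag, and aggregating by concatenation are assumptions for illustration; the abstract does not say how the final portion is recognized or how portions are combined.

```python
class Aggregator:
    """Buffer portions of client data and transmit them once the final portion arrives."""

    def __init__(self, transmit):
        self._transmit = transmit  # callable that delivers data to the third party service
        self._portions = []

    def receive(self, portion, is_final):
        """Store a portion; on the final one, aggregate everything and transmit it."""
        self._portions.append(portion)
        if is_final:
            aggregated = [item for part in self._portions for item in part]
            self._portions = []
            self._transmit(aggregated)
```

Nothing reaches the `transmit` callable until a portion marked final is received, so the third party service only ever sees the aggregated data.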
Decoding a current block includes receiving a compressed bitstream. A transform block of transform coefficients is decoded from the compressed bitstream. The transform coefficients are in a transform domain. The transform block is input to a machine-learning model to obtain a residual block that is in a pixel domain. The residual block is used to reconstruct the current block. Encoding a current block includes receiving a current residual block. The current residual block and a specified rate-distortion parameter are input to a machine-learning model to obtain a quantized transform block. The quantized transform block is entropy encoded into a compressed bitstream.
H04N 19/107 - Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
H04N 19/147 - Data rate or code amount at the encoder output according to rate distortion criteria
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/18 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a set of transform coefficients
H04N 19/91 - Entropy coding, e.g. variable length coding [VLC] or arithmetic coding
A method includes receiving a first facial framework and a first captured image of a face. The first facial framework corresponds to the face at a first frame and includes a first facial mesh of facial information. The method also includes projecting the first captured image onto the first facial framework and determining a facial texture corresponding to the face based on the projected first captured image. The method also includes receiving a second facial framework at a second frame that includes a second facial mesh of facial information and updating the facial texture based on the received second facial framework. The method also includes displaying the updated facial texture as a three-dimensional avatar. The three-dimensional avatar corresponds to a virtual representation of the face.
Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for wireless charging using time-division multiplexing. In some implementations, a wireless charger is configured to concurrently charge multiple devices by providing power wirelessly to individual devices in different time periods. The wireless charger can perform time-division multiplexing by selectively directing the output of a single driver circuit to different power transfer coil segments at different times. The wireless charging sessions of multiple devices can be maintained by repeating a pattern of activating different power transfer coil segments one by one in successive time periods.
H02J 50/90 - Circuit arrangements or systems for wireless supply or distribution of electric power involving detection or optimisation of position, e.g. alignment
H02J 7/00 - Circuit arrangements for charging or depolarising batteries or for supplying loads from batteries
H02J 50/12 - Circuit arrangements or systems for wireless supply or distribution of electric power using inductive coupling of the resonant type
H02J 50/40 - Circuit arrangements or systems for wireless supply or distribution of electric power using two or more transmitting or receiving devices
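The time-division multiplexing pattern in the wireless charging abstract above can be illustrated with a minimal sketch. It assumes the coil segments are identified by plain strings and that every time period has equal length; the function name `tdm_slots` and the segment names are hypothetical and do not come from the source.

```python
from itertools import cycle, islice
from typing import List, Sequence


def tdm_slots(segments: Sequence[str], num_slots: int) -> List[str]:
    """Return which coil segment the single driver feeds in each time slot.

    Hypothetical helper: repeats the segment pattern one by one in
    successive time periods, as the abstract describes.
    """
    # cycle() repeats the activation pattern; islice() cuts it to num_slots.
    return list(islice(cycle(segments), num_slots))


if __name__ == "__main__":
    # Three devices, each sitting on its own coil segment (assumed names).
    for slot, segment in enumerate(tdm_slots(["coil_a", "coil_b", "coil_c"], 7)):
        print(f"slot {slot}: drive {segment}")
```

Each device's charging session is kept alive because its segment is revisited once per pattern repetition.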
36.
Transferable Neural Architecture for Structured Data Extraction From Web Documents
Systems and methods for efficiently identifying and extracting machine-actionable structured data from web documents are provided. The technology employs neural network architectures which process the raw HTML content of a set of seed websites to create transferrable models regarding information of interest. These models can then be applied to the raw HTML of other websites to identify similar information of interest. Data can thus be extracted across multiple websites in a functional, structured form that allows it to be used further by a processing system.
Example embodiments of the present disclosure provide for an example method. The example method includes generating an initial user interface including a content assistant component. The example method includes obtaining user input data. The example method includes processing, by a machine learned model interfacing with the content assistant component, the data indicative of the input received from the user. The method includes obtaining output data, from the machine learned model interfacing with the content assistant component, indicative of one or more content item components. The method includes transmitting data which causes the content item components to be provided for display via an updated user interface. The method includes obtaining data indicative of user selection of approval of the content item components. The method includes generating, in response to obtaining the data indicative of the user selection of the approval of the content item components, content items.
Example embodiments of the present disclosure provide for an example method that includes obtaining via a conversational campaign assistant interface, by a custom language model, natural language input. The method includes generating, by the custom language model, an output comprising a predicted user intent. The method includes determining actions to perform and determining a natural language response. The method includes transmitting, to an action component, the action data structure comprising executable instructions that cause the action component to automatically perform operations associated with completing the action. The method includes transmitting to the conversation campaign assistant interface, the response data structure comprising the natural language response to be provided for display to a user via the conversational campaign assistant interface. The method includes obtaining user input indicative of a validation of the action data structure or the response data structure and updating the custom language model based on the user input.
Various arrangements are presented that provide improvements of short-range wireless communications, such as Bluetooth LE Audio communication. An audio source device may determine that unidirectional audio is to be output. In response to determining that unidirectional audio is to be output, a first physical layer (PHY) configuration can be set for a first communication link in the downlink direction from the audio source device to the audio output device. A second PHY configuration can be set for the communication link in the uplink direction from the audio output device to the audio source device. The first PHY configuration has a greater symbol rate than the second PHY configuration.
H04W 4/80 - Services using short range communication, e.g. near-field communication [NFC], radio-frequency identification [RFID] or low energy communication
40.
INITIALIZING A CONVERSATION WITH AN AUTOMATED AGENT VIA SELECTABLE GRAPHICAL ELEMENT
Methods, apparatus, systems, and computer-readable media are provided for using selectable elements to invoke an automated assistant at a computing device. While operating the computing device, a user may not be aware that the automated assistant can be invoked according to certain invocation phrases. In order to inform the user of the functionality of the automated assistant, the user can be presented with selectable elements that can initialize the automated assistant when selected. Furthermore, a selectable element can provide an invocation phrase in textual form so that the user is aware of their ability to invoke the automated assistant by speaking the invocation phrase. The selectable element can be presented at different devices associated with the user, and the automated assistant can be initialized at a device that is separate from the device where the selectable element is presented.
G06F 9/451 - Execution arrangements for user interfaces
G06F 3/04812 - Interaction techniques based on cursor appearance or behaviour, e.g. being affected by the presence of displayed objects
G06F 3/0488 - Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
A first wireless device employs an uplink (UL) pre-transmission process to temporarily buffer data for processing prior to transmission of the resulting processed data to a second wireless device. To mitigate excessive delay of higher-priority data, higher-priority data is enqueued into the UL pre-transmission process without restriction (subject to capacity limitations), while lower-priority data is selectively enqueued into the UL pre-transmission process based on one or more criteria applied to a current volume of data in the input queue. Further, the first wireless device monitors the current transmission efficiency based on, for example, the current usage of transmission padding, and operates to dynamically adjust one or more of the criteria based on the monitored current transmission efficiency.
An apparatus for handling electronic components such as hard disk drives. In one aspect, the apparatus includes a main body defining an interior space with an open front; a drive system that propels and positions the apparatus along a horizontal surface; a fan system mounted within the interior space and positioned to blow air down into the interior space; a first gripper apparatus that engages an equipment drawer of an electronics rack; and a second gripper apparatus that grips and removes a target electronic component from a target position located within the equipment drawer, wherein at least a back surface of the main body includes perforations so that sufficient air flow generated by the fan system flows through the perforations to maintain cooling of electronic components in the equipment drawer when the equipment drawer is in the extracted position.
Methods and apparatus related to associating a task with a user based on the user selecting a task suggestion that is provided to the user in response to a user query. In some implementations, the task may be identified based on similarities between the words and/or phrases of the user query and a task suggestion that is associated with a task. In some implementations, the task may be identified based on user data associated with the user. In some implementations, the task may be associated with additional information related to completing the task.
A method for detecting network anomalies includes receiving a control message from a cellular network and extracting one or more features from the control message. The method also includes predicting a potential label for the control message using a predictive model configured to receive the one or more extracted features from the control message as feature inputs. Here, the predictive model is trained on a set of training control messages where each training control message includes one or more corresponding features and an actual label. The method further includes determining that a probability of the potential label satisfies a confidence threshold. The method also includes analyzing the control message to determine whether the control message corresponds to a respective network performance issue. When the control message impacts network performance, the method includes communicating the network performance issue to a network entity responsible for the network performance issue.
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scheduling operations represented on a computation graph. One of the methods includes receiving, by a computation graph system, a request to generate a schedule for processing a computation graph; obtaining data representing the computation graph; generating a separator of the computation graph; and generating the schedule to perform the operations represented in the computation graph, wherein generating the schedule comprises: initializing the schedule with zero nodes; for each node in the separator: determining whether the node has any predecessor nodes in the computation graph, when the node has any predecessor nodes, adding the predecessor nodes to the schedule, and adding the node to the schedule; and adding to the schedule each node in each subgraph that is not a predecessor to any node in the separator on the computation graph.
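The separator-driven scheduling steps above can be sketched as follows. This is an illustrative reading only, under stated assumptions: the graph is given as a hypothetical mapping from each node to its direct predecessors, "predecessor nodes" is taken to mean all transitive predecessors, and ties are resolved with a topological order from Python's standard `graphlib` module. How the separator itself is generated is not shown; it is passed in.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def schedule_with_separator(preds: Dict[str, Set[str]], separator: List[str]) -> List[str]:
    """Build a schedule: separator nodes (with their predecessors) first, the rest after."""
    # A full topological order keeps every node after its predecessors.
    topo = list(TopologicalSorter(preds).static_order())

    def ancestors(node: str) -> Set[str]:
        # Collect all transitive predecessors of `node` (assumed interpretation).
        seen: Set[str] = set()
        stack = list(preds.get(node, ()))
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(preds.get(current, ()))
        return seen

    schedule: List[str] = []  # initialize the schedule with zero nodes
    placed: Set[str] = set()

    def place(node: str) -> None:
        if node not in placed:
            schedule.append(node)
            placed.add(node)

    for sep_node in separator:
        needed = ancestors(sep_node)
        for node in topo:  # add predecessors in a valid order
            if node in needed:
                place(node)
        place(sep_node)  # then add the separator node itself

    for node in topo:  # finally, nodes that precede no separator node
        place(node)
    return schedule


if __name__ == "__main__":
    graph = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}, "e": {"d"}}
    print(schedule_with_separator(graph, ["d"]))
```

In the example graph, node `d` separates `{a, b, c}` from `{e}`, so the predecessors of `d` are scheduled first, then `d`, then `e`.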
An actuator module includes a baseplate extending in a plane, a voice coil connected to the baseplate, and a magnet assembly. The actuator module also includes a rigid frame attached to the baseplate, the rigid frame comprising four stubs. The actuator module further includes a pair of springs suspending the magnet assembly relative to the frame and baseplate so that the voice coil extends into the air gap, the pair of springs including a first and second spring each shaped as a loop defining an aperture sized to accommodate motion of the magnet assembly along a direction of the coil axis, the first spring being attached to the frame at a first pair of the four stubs, the second spring being attached to the frame at a second pair of the four stubs, and both being attached to separate portions of the magnet assembly.
The technology relates to methods and systems for implicit calibration for gaze tracking. This can include receiving, by a neural network module, display content that is associated with presentation on a display screen (1202). The neural network module may also receive uncalibrated gaze information, in which the uncalibrated gaze information includes an uncalibrated gaze trajectory that is associated with a viewer gaze of the display content on the display screen (1204). A selected function is applied by the neural network module to the uncalibrated gaze information and the display content to generate a user-specific gaze function (1206). The user-specific gaze function has one or more personalized parameters. And the neural network module can then apply the user-specific gaze function to the uncalibrated gaze information to generate calibrated gaze information associated with the display content on the display screen (1208). Training and testing information may alternatively be created for implicit gaze calibration (1000).
Methods for creating a live copy of a data object from a production system for use by third party applications include receiving at least one request for a copy of production data from an application; creating a live backup copy; creating a flash copy of the live backup copy, and a flash copy bitmap; creating a modified version of the live backup copy by changing a subset of data in the live backup copy; recording the changed subset of data using the flash copy bitmap; mounting the modified version of the live backup copy to the application; and transforming the modified version of the live backup copy back to the live backup copy when unmounting the modified version of the live backup copy of the production data from the application by applying changes associated with the flash copy bitmap to the live backup copy.
G06F 11/14 - Error detection or correction of the data by redundancy in operation, e.g. by using different operation sequences leading to the same result
A method can include selecting, from at least a first avatar and a second avatar based on at least one attribute of a calendar event associated with a user, a session avatar, the first avatar being based on a first set of images of a user wearing a first outfit and the second avatar being based on a second set of images of the user wearing a second outfit, and presenting the session avatar during a videoconference, the presentation of the session avatar changing based on audio input received from the user during the videoconference.
Systems and methods for performing captioning for image or video data are described herein. The method can include receiving unlabeled multimedia data, and outputting, from a machine learning model, one or more captions for the multimedia data. Training the machine learning model to create these outputs can include inputting a subset of video frames and a first utterance into the machine learning model, using the machine learning model to predict a predicted utterance based on the subset of video frames and the first utterance, and updating one or more parameters of the machine learning model based on a loss function that compares the predicted utterance with a second utterance.
Implementations relate to an automated assistant that provides augmented reality content, via a display interface of computerized glasses, resulting from post-processing of application content. The application content can be identified based on prior interactions between a user and one or more applications, and the application content can be processed to determine objects, and/or object classifications, that may be associated with the application content. When the user is wearing the computerized glasses, and the object is detected within a field of view of the computerized glasses, the automated assistant can cause certain content to be rendered at the display interface of the computerized glasses. In some implementations, the content can be generated to supplement, and/or be different from, existing content that the user may have already accessed, in furtherance of preventing duplicative usage of applications and/or preserving computational resources.
Implementations relate to an automated assistant that can proactively detect and respond to a request for credentials. Characteristics of an entity requesting the credentials can be preemptively determined by the automated assistant using data that may be provided by the user or other previous visitors to a location. For example, the automated assistant can determine that the entity may expressly request certain information from a user when the user arrives at the location. Based on this determination, the automated assistant can operate to initialize an interface of a computing device of the user, when the user is determined to be at or near the location. For example, an audio interface of the computing device can be initialized to capture an audible request from a person who views credentials before granting access to a feature of the location.
G06F 16/909 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
53.
A GENERALIST FRAMEWORK FOR PANOPTIC SEGMENTATION OF IMAGES AND VIDEOS
Provided are systems and methods for performing panoptic segmentation of images and videos using a denoising diffusion model. The panoptic segmentation task is formulated as a conditional discrete data generation problem. This is achieved by learning a generative model for panoptic masks, for example treated as an array of discrete tokens, conditioned on an input image. The generative model can also be applied to video data by including predictions from past frames as an additional conditioning signal. This enables the model to learn to track and segment objects automatically across video frames.
Methods and techniques for manipulating the color of an image based on a text-based description are presented herein. A system can access an input image and an input text. The system can process, using a machine-learned recolorizing model, the input image to generate a recolorized image. A system can determine the similarity between the recolorized image and the input text description using a loss function and pre-trained encoder(s) which have been trained on a large dataset of text and images to convert the text and image inputs into the same embedding space. The system can then modify the one or more parameter values of the machine-learned recolorizing model to minimize the value of the loss function. Thus, after a plurality of iterations, the machine-learned recolorizing model will generate a recolorized photo that matches the description given in the input text.
Systems and methods for routing data packets in a ring network. A data packet being transmitted to a destination node may be received by a first structure at a first node. The first node may determine a number of hops the data packet will traverse as it is transmitted from the first node to the destination node and compare the determined number of hops to a threshold hop value to determine whether the number of hops is equal to or less than the threshold hop value. If the number of hops is greater than the threshold, the data packet may be transmitted to a dimension queuing structure for a first virtual channel within a second node; otherwise, the data packet may be transmitted to a dimension queuing structure for a second virtual channel or a turn queuing structure within the second node.
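The hop-threshold decision in the ring routing abstract above can be sketched as follows. This is a minimal illustration, assuming a unidirectional ring whose nodes are numbered 0 to N-1 so that the hop count is a modular difference; the function names and queue labels are hypothetical, and the choice between the second virtual channel and the turn queue is not modeled.

```python
def hops_on_ring(src: int, dst: int, ring_size: int) -> int:
    """Hops from src to dst when packets travel one way around the ring (assumption)."""
    return (dst - src) % ring_size


def select_queue(src: int, dst: int, ring_size: int, threshold: int) -> str:
    """Pick the queuing structure in the next node based on remaining hops."""
    hops = hops_on_ring(src, dst, ring_size)
    if hops > threshold:
        # Long path: dimension queuing structure for the first virtual channel.
        return "dimension_queue_vc1"
    # Hops equal to or less than the threshold: second virtual channel or turn queue.
    return "dimension_queue_vc2_or_turn_queue"


if __name__ == "__main__":
    print(select_queue(src=0, dst=5, ring_size=8, threshold=3))
    print(select_queue(src=0, dst=3, ring_size=8, threshold=3))
```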
Systems and methods are provided for restricting secondary node (SN) status changes. A UE generates a B1 measurement report that would, absent any restriction on SN status changes, trigger an SN addition procedure. A determination is made whether the "Restrict-secondary-node-addition" flag within the eNB/gNB is set to true. If not, no SN status change restriction applies to the UE and the process proceeds to perform secondary node addition. If so, a determination is made whether the UE has been in connected mode within the cell for a time greater than the "time-threshold-for-reject-secondary node-addition" parameter value. If not, secondary node addition is performed. If so, a determination is made whether the UE has requested secondary node addition a number of times greater than "number-threshold-for-secondary-node-rejection". If not, secondary node addition is performed. If so, the eNB/gNB MN determines not to proceed with the secondary node addition request for the UE-reported measurement.
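The chain of determinations above can be sketched as a single decision function. This is an illustrative sketch only: the `NodeConfig` class, its field names, and the use of seconds as the time unit are assumptions made for the example, mirroring the three parameters named in the abstract rather than any real eNB/gNB interface.

```python
from dataclasses import dataclass


@dataclass
class NodeConfig:
    """Hypothetical container for the three master-node parameters."""
    restrict_secondary_node_addition: bool
    time_threshold_s: float   # stands in for the time threshold parameter
    number_threshold: int     # stands in for the request-count threshold parameter


def should_add_secondary_node(cfg: NodeConfig, connected_time_s: float, request_count: int) -> bool:
    """Return True when the SN addition triggered by a B1 report should proceed."""
    if not cfg.restrict_secondary_node_addition:
        return True   # restriction flag not set: always add
    if connected_time_s <= cfg.time_threshold_s:
        return True   # not connected in this cell longer than the threshold: add
    if request_count <= cfg.number_threshold:
        return True   # has not requested addition more times than the threshold: add
    return False      # all three checks answered yes: reject the request


if __name__ == "__main__":
    cfg = NodeConfig(True, time_threshold_s=60.0, number_threshold=3)
    print(should_add_secondary_node(cfg, connected_time_s=120.0, request_count=5))
```

The request is rejected only when the flag is set and both thresholds are exceeded; any "no" along the way falls through to secondary node addition.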
Decoding a current block using inter prediction with filtering includes identifying an intermediate prediction block for the current block using a motion vector and a reference frame. Filter coefficients are obtained for a filter. The filter coefficients are obtained using reconstructed pixels and second reconstructed pixels. The reconstructed pixels are peripheral to the current block. The second reconstructed pixels are peripheral to the intermediate prediction block. The filter is applied to the intermediate prediction block to obtain a final prediction block. The current block is reconstructed using the final prediction block. Encoding a current block includes obtaining an intermediate motion vector for the current block. Filter coefficients are obtained by minimizing an error metric between a prediction block corresponding to the intermediate motion vector and the current block. A motion vector is obtained for the current block by refining the intermediate motion vector using the filter coefficients.
H04N 19/117 - Filters, e.g. for pre-processing or post-processing
H04N 19/137 - Motion inside a coding unit, e.g. average field, frame or block difference
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/186 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
H04N 19/196 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding being specially adapted for the computation of encoding parameters, e.g. by averaging previously computed encoding parameters
H04N 19/46 - Embedding additional information in the video signal during the compression process
H04N 19/463 - Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
H04N 19/82 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals - Details of filtering operations specially adapted for video compression, e.g. for pixel interpolation involving filtering within a prediction loop
58.
COLOR DECORRELATION IN VIDEO AND IMAGE COMPRESSION
Image and video compression using color decorrelation is described. A method described herein includes receiving color transform information for an encoded block of image data, wherein the color transform information identifies an adaptive transform matrix used to convert an original block of the image data from an original color space to a new color space, thereby resulting in color decorrelation of the original block. A decoder receives a compressed bitstream including the encoded block that was encoded using the new color space and reconstructs the block from the encoded block. The method includes determining, from the color transform information, the adaptive transform matrix. After reconstructing the block, an inverse color transform of the block is performed using the matrix to obtain pixel values for a reconstructed block corresponding to the original block in the original color space, and the image data including the reconstructed block is stored or transmitted.
H04N 19/12 - Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
H04N 19/157 - Assigned coding mode, i.e. the coding mode being predefined or preselected to be further used for selection of another element or parameter
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a block, e.g. a macroblock
H04N 19/186 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being a colour or a chrominance component
H04N 19/463 - Embedding additional information in the video signal during the compression process by compressing encoding parameters before transmission
H04N 19/70 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by syntax aspects related to video coding, e.g. related to compression standards
59.
PHYSICAL LAYER IMPROVEMENTS FOR SHORT RANGE WIRELESS COMMUNICATIONS
Various arrangements are presented that provide improvements of short-range wireless communications, such as Bluetooth LE Audio communication. An audio source device may determine that unidirectional audio is to be output. In response to determining that unidirectional audio is to be output, a first physical layer (PHY) configuration can be set for a first communication link in the downlink direction from the audio source device to the audio output device. A second PHY configuration can be set for the communication link in the uplink direction from the audio output device to the audio source device. The first PHY configuration has a greater symbol rate than the second PHY configuration.
A method (500) includes, for each training sample (410) of a plurality of training samples: processing, using a sequence transduction model (200), corresponding training input features (415) to obtain one or more output token sequence hypotheses (432) each including one or more predicted common tokens (204); and determining a token-level loss (462) based on, for each hypothesis: a number of special token insertions each associated with a corresponding predicted special token that appears in the hypothesis but does not appear in a corresponding sequence of ground-truth output tokens; and a number of special token deletions each associated with a corresponding ground-truth special token in the set of ground-truth special tokens that does not appear in the hypothesis. The method also includes training the sequence transduction model to minimize additive error rate based on the token-level losses determined for the plurality of training samples.
G10L 17/04 - Training, enrolment or model building
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G06N 3/0442 - Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for managing an interface between a pair of processing cores of a device that are configured to exchange data. The device is configured to enable or disable one or more of the pair of processing cores. One of the methods includes configuring a connect/disconnect interface implemented as logic circuitry between the pair of processing cores to assume a connected state in which the pair of processing cores can exchange data, and configuring the connect/disconnect interface between the pair of processing cores to assume a disconnected state in which one or more of the processing cores is unable to receive data.
A waveguide includes an outcoupler that is implemented in the waveguide as a set of reflective facets that is arranged along a first direction. Each reflective facet is made by applying a reflective coating to a planar face of one or more substrates. Adjacent reflective facets in the set of reflective facets overlap one another along the first direction. For example, a leading portion of one reflective facet in the set of reflective facets overlaps with a tailing portion of the reflective facet adjacent to it.
A method for accessing localized services of a hosting network is implemented in a user equipment (UE) associated with a home network. The method includes receiving an indication of whether the UE is to access the localized services of the hosting network via the hosting network or a serving network distinct from the hosting network (1212); and accessing the localized services in accordance with the indication (i) directly via a radio access network (RAN) of the hosting network (1240) or (ii) via a RAN of the serving network operating as an underlay network, and the hosting network operating as an overlay network (1242).
A method may receive an image representing displayable content for display by an application. A method may execute a layout extraction model using the image as input and generating a list of elements for the image as output, the list of elements including at least a bounding box defining a portion of the image and a role attribute. A method may add the role attribute to a node in an accessibility tree using the list of elements.
A method (500) includes receiving a sequence of acoustic frames (100) as input to a multilingual automated speech recognition (ASR) model (200) configured to recognize speech in a plurality of different supported languages and generating, by an audio encoder (204) of the multilingual ASR, a higher order feature representation (212, 222) for a corresponding acoustic frame. The method also includes generating, by a language identification (LID) predictor (230) of the multilingual ASR, a language prediction representation (232) for a corresponding higher order feature representation. The method also includes generating, by a decoder (240) of the multilingual ASR, a probability distribution (252) over possible speech recognition results based on the corresponding higher order feature representation, a sequence of non-blank symbols (121), and a corresponding language prediction representation. The decoder includes a monolingual output layer (400) having a plurality of output nodes (410) each sharing a plurality of language-specific wordpiece models (420).
A method (600) includes obtaining a multi-utterance training sample (410) that includes audio data (412) characterizing utterances spoken by two or more different speakers (10) and obtaining ground-truth speaker change intervals (414) indicating time intervals in the audio data where speaker changes among the two or more different speakers occur. The method also includes processing the audio data to generate a sequence of predicted speaker change tokens (302) using a sequence transduction model (300). For each corresponding predicted speaker change token, the method includes labeling the corresponding predicted speaker change token as correct when the predicted speaker change token overlaps with one of the ground-truth speaker change intervals. The method also includes determining a precision metric (442) of the sequence transduction model based on a number of the predicted speaker change tokens labeled as correct and a total number of the predicted speaker change tokens.
G10L 17/04 - Training, enrolment or model building
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G06N 3/0442 - Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
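The precision metric in the speaker change abstract above can be illustrated with a short sketch. It assumes each predicted speaker change token is reduced to a single timestamp in seconds and that "overlaps" means the timestamp falls inside a ground-truth interval; both simplifications, and the function name, are assumptions for the example rather than details from the source.

```python
from typing import List, Tuple


def speaker_change_precision(
    predicted_times: List[float],
    truth_intervals: List[Tuple[float, float]],
) -> float:
    """Fraction of predicted speaker change tokens labeled as correct."""
    if not predicted_times:
        return 0.0  # assumed convention when nothing was predicted
    correct = sum(
        # A prediction is correct when it lands in any ground-truth interval.
        any(start <= t <= end for start, end in truth_intervals)
        for t in predicted_times
    )
    return correct / len(predicted_times)


if __name__ == "__main__":
    predictions = [1.0, 2.5, 7.0]
    intervals = [(0.8, 1.2), (6.5, 7.5)]
    print(speaker_change_precision(predictions, intervals))
```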
A non-transitory computer-readable storage medium may comprise instructions stored thereon. When executed by at least one processor, the instructions may be configured to cause a computing system to at least present a user interface of an application in association with a user account, the user interface including at least one fillable field, determine a content type of the at least one fillable field, search messages stored in association with the user account for a text string associated with the content type of the at least one fillable field, and fill the at least one fillable field with the text string.
G06F 3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
G06F 16/958 - Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
Using a natural language (NL) latent representation in the automated conversion of source code from a base programming language (e.g., C++) to a target programming language (e.g., Python). A base-to-NL model can be used to generate an NL latent representation by processing a base source code snippet in the base programming language. Further, an NL-to-target model can be used to generate a target source code snippet in the target programming language (that is functionally equivalent to the base source code snippet), by processing the NL latent representation. In some implementations, output(s) from the NL-to-target model indicate canonical representation(s) of variables, and in generating the target source code snippet, technique(s) are used to match those canonical representation(s) to variable(s) of the base source code snippet. In some implementations, multiple candidate target source code snippets are generated, and a subset (e.g., one) is selected based on evaluation(s).
(1) Providing online non-downloadable software for use in large language models and artificial intelligence; providing online non-downloadable software using artificial intelligence for the production of human speech and text; providing online non-downloadable software for natural language processing, generation, understanding and analysis; providing online non-downloadable software for artificial intelligence and machine-learning based language and speech processing software; providing online non-downloadable software for creating generative models; providing online non-downloadable software for generating speech, text, sound, code, videos, images, and sound input; research and development services in the field of artificial intelligence; research, development and evaluation of large language models and data sets; research, design and development of computer programs and software; providing online non-downloadable software for managing data sets and performing safety checks in the field of artificial intelligence; providing online non-downloadable software for multi-modal artificial intelligence and machine-learning based language, text, and speech processing software; providing temporary use of online non-downloadable software for facilitating multi-modal natural language, speech, text, sound, code, videos, images, and sound input; research and development services in the field of multi-modal computer natural language processing, artificial intelligence, and machine learning; providing temporary use of online non-downloadable software for an integrated development environment for large language models; providing online non-downloadable software for use in the fields of artificial intelligence, machine learning, natural language generation, statistical learning, mathematical learning, supervised learning, and unsupervised learning; providing information from searchable indexes and databases of information, including text, music, images, videos, software algorithms, mathematical equations, electronic documents, and databases; Software as a service (SAAS) featuring software for training software developers in the field of artificial intelligence.
42 - Scientific, technological and industrial services, research and design
Goods & Services
Providing online non-downloadable software for use in connection with large language models and artificial intelligence; Providing online non-downloadable software using artificial intelligence for the production of human speech, text, video, sound, music and images; Providing online non-downloadable software for natural language processing, generation, understanding and analysis; Providing online non-downloadable software for artificial intelligence and machine-learning based language, text, speech, video, sound, music and image processing software; Providing online non-downloadable software for transferring, viewing, sharing, and uploading text, speech, video, sound, music and images; Providing online non-downloadable software for generating and editing text, speech, video, sound, music and image outputs; Providing online non-downloadable software for generating and editing text, speech, video, sound, music and image input; Research and development services in the field of artificial intelligence; Providing online non-downloadable software for multi-modal artificial intelligence and machine-learning based language, text, speech, video, sound, music and image processing software; Providing online non-downloadable software for facilitating multi-modal natural language, text, speech, video, sound, music and image input; Research and development services in the field of multi-modal computer natural language processing, artificial intelligence, and machine learning; Providing online non-downloadable software for use in the fields of artificial intelligence, machine learning, and natural language generation.
42 - Scientific, technological and industrial services, research and design
Goods & Services
(1) Providing online non-downloadable software for use in connection with large language models and artificial intelligence; providing online non-downloadable software using artificial intelligence for the production of human speech, text, video, sound, music and images; providing online non-downloadable software for natural language processing, generation, understanding and analysis; providing online non-downloadable software for artificial intelligence and machine-learning based language, text, speech, video, sound, music and image processing software; providing online non-downloadable software for transferring, viewing, sharing, and uploading text, speech, video, sound, music and images; providing online non-downloadable software for generating and editing text, speech, video, sound, music and image outputs; providing online non-downloadable software for generating and editing text, speech, video, sound, music and image input; research and development services in the field of artificial intelligence; providing online non-downloadable software for multi-modal artificial intelligence and machine-learning based language, text, speech, video, sound, music and image processing software; providing online non-downloadable software for facilitating multi-modal natural language, text, speech, video, sound, music and image input; research and development services in the field of multi-modal computer natural language processing, artificial intelligence, and machine learning; providing online non-downloadable software for use in the fields of artificial intelligence, machine learning, and natural language generation.
Various arrangements of wireless earbuds are presented. A first earbud can include a first speaker, a first processing system, and a first wireless communication interface that communicates with an audio source device using Bluetooth communications. A second earbud can include a second speaker, a second processing system, and a second wireless communication interface that communicates with the audio source device and the first earbud using Bluetooth communications. The first earbud and the second earbud may be configured to wirelessly communicate with each other following completion of a first connected isochronous stream (CIS) event for the first earbud and a second CIS event for the second earbud within a connected isochronous group (CIG) event.
Various arrangements for short-range wireless communication are presented herein. An earbud of a pair of true wireless earbuds can receive an audio packet addressed to the other earbud of the pair. A single connected isochronous stream (CIS) within a connected isochronous group (CIG) may be present between the pair of true wireless earbuds and an audio source which transmitted the audio packet. The earbud can transmit a cross-acknowledgement indicating receipt of the audio packet to the other earbud. The earbud can also transmit audio data from the audio packet to the other earbud after the cross-acknowledgement.
Methods and structures for providing thermal dissipating elements on integrated circuit (“IC”) dies are disclosed. A thermal dissipating element placement assembly, such as a pin fin placement assembly, along with a vacuum pickup assembly, can be used to assist with simultaneous placement of multiple pin fins with desired profiles on desired locations of the IC die. The pin fin placement assembly may be comprised of one or more plates with a plurality of apertures therein for receiving the pin fins. The pin fin placement assembly can be further incorporated into a thermal cooling structure, which can include a manifold configured to encase the IC die and attached pin fins.
Aspects of the disclosure provide a deep sequence model, referred to as Koopman Neural Forecaster (KNF), for time series forecasting. KNF leverages deep neural networks (DNNs) to learn the linear Koopman space and the coefficients of chosen measurement functions. KNF imposes appropriate inductive biases for improved robustness against distributional shifts, employing both a global operator to learn shared characteristics, and a local operator to capture changing dynamics, as well as a specially-designed feedback loop to continuously update the learnt operators over time for rapidly varying behaviors. KNF achieves superior performance on multiple time series datasets that are shown to suffer from distribution shifts.
Methods, systems, and apparatus for receiving, from a user, a request that includes an entity identifier associated with an entity that is referenced by one or more query terms of a search query, determining that the entity is identified in a media consumption database as a media item that has been indicated as consumed by the user or that the entity is associated with a media item that is identified in the media consumption database as a media item that has been indicated as consumed by the user, and based on the determination, providing a response to the request, the response including data indicating that the entity is a media item that has been indicated as consumed by the user or that the entity is associated with a media item that has been indicated as consumed by the user.
G06F 16/487 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
G06F 16/435 - Filtering based on additional data, e.g. user or group profiles
G06F 16/48 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
G06F 16/783 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
G06F 16/9535 - Search customisation based on user profiles and personalisation
G06F 16/955 - Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
Data indicating one or more verbal phrases provided by one or more participants during a conference call is fed as input to a machine learning model. One or more outputs of the machine learning model are obtained. A polling question for polling at least a portion of the participants is extracted from the one or more outputs of the machine learning model. The polling question is based on one or more verbal phrases provided by the one or more participants. The polling question is provided for polling the at least the portion of the participants during the conference call.
The disclosure is directed towards automatically tuning quantization-based approximate nearest neighbors (ANN) search methods and systems (e.g., search engines) to perform at the speed-recall Pareto frontier. With a desired search cost or recall as input, the embodiments employ Lagrangian-based methods to perform constrained optimization on theoretically-grounded search cost and recall models. The resulting tunings, when paired with the efficient quantization-based ANN implementation of the embodiments, exhibit excellent performance on standard benchmarks while requiring minimal tuning or configuration complexity.
The embodiments are directed towards providing personalized federated learning (PFL) models via sharable federated basis models. A model architecture and learning algorithm for PFL models is disclosed. The embodiments learn a set of basis models, which can be combined layer by layer to form a personalized model for each client using specifically learned combination coefficients. The set of basis models is shared with each client of a set of clients. Thus, the set of basis models is common to each client of the set of clients. However, each client may generate a unique PFL model based on its specifically learned combination coefficients. The unique combination of coefficients for each client may be encoded in a separate personalized vector for each of the clients.
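A minimal sketch of the layer-by-layer combination described above may help make the idea concrete. It assumes each basis model is a list of layers, each layer is a flat list of weights, and each client holds one coefficient per basis model per layer. That layout, the linear weighting, and all names are assumptions for illustration, not details taken from the source.

```python
from typing import List

# Hypothetical layout: a layer is a flat list of weights, a model is a list of layers.
Layer = List[float]
Model = List[Layer]


def combine_basis_models(
    basis_models: List[Model], coefficients: List[List[float]]
) -> Model:
    """Build a personalized model as a per-layer weighted sum of shared basis models.

    coefficients[layer_idx][k] is the client's weight for basis model k at that
    layer (an assumed layout for this sketch).
    """
    num_layers = len(basis_models[0])
    personalized: Model = []
    for layer_idx in range(num_layers):
        layer_coeffs = coefficients[layer_idx]
        layer_size = len(basis_models[0][layer_idx])
        combined = [
            sum(c * model[layer_idx][i] for c, model in zip(layer_coeffs, basis_models))
            for i in range(layer_size)
        ]
        personalized.append(combined)
    return personalized


if __name__ == "__main__":
    # Two shared basis models, each with two layers.
    basis_a: Model = [[1.0, 0.0], [2.0]]
    basis_b: Model = [[0.0, 1.0], [4.0]]
    # One client's personalized coefficients, one row per layer.
    client_coeffs = [[0.5, 0.5], [0.25, 0.75]]
    print(combine_basis_models([basis_a, basis_b], client_coeffs))
```

In this arrangement only the small coefficient table differs between clients, while the basis models themselves stay common to all of them.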
Provided is an approach that aligns multi-modal tokens using cross-attention without losing the advantages of global self-attention. In contrast to previous works that concatenate the unimodal tokens along the sequence dimension, example approaches described herein align per-modality tokens by chaining them along the channels. Specifically, the tokens from one modality can be used to query the other modality and the output can be concatenated with the query tokens on the channels. An analogous process can also be repeated (or performed in parallel) where the roles of the two modalities are switched. The resulting sets of compound tokens can be concatenated and fed into a self-attention encoder such as a transformer encoder that performs self-attention.
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
G06V 10/774 - Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for online training of machine learning models predicting time-series data. In one aspect, a method comprises training a machine learning model having a plurality of weights by maintaining weight data, specifying a plurality of sub-weights for each of the plurality of weights and covariance data that estimates the joint uncertainty between the sub-weights, and, at each of a plurality of time steps, receiving model inputs, processing the model inputs using the weight data to generate corresponding model outputs, receiving corresponding ground truth outputs, and updating the weight data using the corresponding ground truth outputs.
Methods, systems, and apparatuses to match content providers and interested content users are described. Input indicating an accessing of a network location by a user is received along with the user's identifier. The identifier is obfuscated and transmitted to a content provider configured to provide content to the user at the network location. A re-direct identifier is transmitted to the user instructing the user to directly contact the content provider. When the user contacts the content provider, the user transmits a provider-specific identifier by which the content provider identifies the user and the obfuscated user identifier. The content provider updates a database of obfuscated user identifiers and provider-specific user identifiers based on the received identifiers. Thus, the content provider is enabled to identify interested users based on obfuscated and provider-specific user identifiers.
The disclosure describes various aspects of using optical elements monolithically integrated with light-emitting diode (LED) structures. In an aspect, a light emitting device includes a single LED structure having an active region and a single optical element disposed on the LED structure and configured to collimate and steer light emitted by the LED structure. One or more additional optical elements may also be disposed on the LED structure. In another aspect, a light emitting device may include multiple LED structures and a single optical element disposed on the multiple LED structures and configured to collimate and steer light emitted by the multiple LED structures. For each of these aspects, the LED structure(s) and the optical element(s) are made of a material that includes GaN, the LED structure(s) has a corresponding active region, and the LED structure(s) has a corresponding reflective contact disposed opposite to the optical element(s).
H01L 33/00 - SEMICONDUCTOR DEVICES NOT COVERED BY CLASS - Details thereof
H01L 33/18 - SEMICONDUCTOR DEVICES NOT COVERED BY CLASS - Details thereof characterised by the semiconductor bodies with a particular crystal structure or orientation, e.g. polycrystalline, amorphous or porous within the light emitting region
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for feedback-directed optimization. One of the methods includes maintaining a data store comprising a plurality of optimization profiles that are used by a compiler to compile respective computer programs. The computer programs can be invoked by a set of executing workloads. Operations are repeatedly performed that include, for each optimization profile in at least a subset of the optimization profiles: determining or predicting whether the optimization profile is a valid optimization profile for a current software version of the compiler, and in response to determining or predicting that the optimization profile is not a valid optimization profile for the current software version of the compiler, removing the optimization profile from the data store.
The present disclosure is directed to new, more efficient neural network architectures. As one example, in some implementations, the neural network architectures of the present disclosure can include a linear bottleneck layer positioned structurally prior to and/or after one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. As another example, in some implementations, the neural network architectures of the present disclosure can include one or more inverted residual blocks where the input and output of the inverted residual block are thin bottleneck layers, while an intermediate layer is an expanded representation. For example, the expanded representation can include one or more convolutional layers, such as, for example, one or more depthwise separable convolutional layers. A residual shortcut connection can exist between the thin bottleneck layers that play the role of input and output of the inverted residual block.
Coordinating signal processing among computing devices in a voice-driven computing environment is provided. A first and second digital assistant can detect an input audio signal, perform a signal quality check, and provide indications that the first and second digital assistants are operational to process the input audio signal. A system can select the first digital assistant for further processing. The system can receive, from the first digital assistant, data packets including a command. The system can generate, for a network connected device selected from a plurality of network connected devices, an action data structure based on the data packets, and transmit the action data structure to the selected network connected device.
G10L 25/60 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
A method for using a user-fillable form in a host container includes receiving, at a host container, a user-fillable form bound to dynamic data from an underlying data source where the user-fillable form has a data structure generated by prepopulated coding. The method further includes translating the user-fillable form into a hostable format for the host container. The method also includes rendering, using the hostable format for the host container, the user-fillable form in a user interface. The method further includes receiving, at the user interface of the host container, from a user of the host container, a data entry for input to the user-fillable form and updating, by the host container, the dynamic data from the underlying data source by persisting data from the data entry in a data store associated with the underlying data source.
Computer-implemented techniques can include obtaining, by a client computing device, a digital media item and a request for a processing task on the digital media item and determining a set of operating parameters based on (i) available computing resources at the client computing device and (ii) a condition of a network. Based on the set of operating parameters, the client computing device or a server computing device can select one of a plurality of artificial neural networks (ANNs), each ANN defining which portions of the processing task are to be performed by the client and server computing devices. The client and server computing devices can coordinate processing of the processing task according to the selected ANN. The client computing device can also obtain final processing results corresponding to a final evaluation of the processing task and generate an output based on the final processing results.
Some implementations are directed to adapting a client application on a feature phone based on experiment parameters. Some of those implementations are directed to adapting an assistant client application, where the assistant client application interacts with remote assistant component(s) to provide automated assistant functionalities via the assistant client application of the feature phone. Some implementations are additionally or alternatively directed to determining whether an invocation, of an assistant client application on a feature phone, is a request for transcription of voice data received in conjunction with the invocation, or is instead a request for an assistant response that is responsive to the transcription of the voice data (e.g., includes assistant content that is based on and in addition to the transcription, and that optionally lacks the transcription itself).
A method includes detecting multiple users, receiving a first query issued by a first user, the first query including a command for a digital assistant to perform a first action, and enabling a round robin mode to control performance of actions commanded by queries. The method also includes, while performing the first action, receiving audio data corresponding to a second query including a command to perform a second action, performing speaker identification on the audio data, determining that the second query was spoken by the first user, preventing performing the second action, and prompting at least another user to issue a query. The method further includes receiving a third query issued by a second user, the third query including a command for the digital assistant to perform a third action, and when the digital assistant completes performing the first action, executing performance of the third action.
A waveguide including first and second sections has a first molded optic material forming a portion of the geometry of one or more Bragg gratings disposed on one surface of the first section of the waveguide. Similarly, a second molded optic material forming another portion of the geometry of one or more Bragg gratings is disposed on one surface of the second section of the waveguide. Further, a photopolymer material is deposited on the first molded optic material. As the first and second sections are coupled, a waveguide is formed with a layer of photopolymer material disposed in the waveguide with the layer of photopolymer material having a geometry defined by the first and second molded optic materials. Bragg grating holograms are then recorded in the layer of photopolymer material, resulting in a waveguide with a plurality of Bragg gratings.
Implementations relate to an automated assistant that provides augmented reality content, via a display interface of computerized glasses, resulting from post-processing of application content. The application content can be identified based on prior interactions between a user and one or more applications, and the application content can be processed to determine objects, and/or object classifications, that may be associated with the application content. When the user is wearing the computerized glasses, and the object is detected within a field of view of the computerized glasses, the automated assistant can cause certain content to be rendered at the display interface of the computerized glasses. In some implementations, the content can be generated to supplement, and/or be different from, existing content that the user may have already accessed, in furtherance of preventing duplicative usage of applications and/or preserving computational resources.
To determine locations of ultra-wideband (UWB) anchor devices in an indoor positioning system, the indoor positioning system obtains distance measurements between each pair of N UWB anchor devices in the indoor positioning system. Each distance measurement is determined based on a round trip time of a UWB signal communicated between the pair of UWB anchor devices. The indoor positioning system also determines a location of each of the UWB anchor devices within the indoor positioning system using the distance measurements, and reconstructs an absolute network topology of the UWB anchor devices using the determined locations of the UWB anchor devices. The absolute network topology is used to determine a location of a client device within the indoor positioning system.
G01S 5/02 - Position-fixing by co-ordinating two or more direction or position-line determinations; Position-fixing by co-ordinating two or more distance determinations using radio waves
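The round-trip-time ranging mentioned in the UWB anchor abstract above can be illustrated with a short sketch. It assumes simple two-way ranging in which the responding anchor's reply delay is known and subtracted, and it ignores clock drift and antenna delay. The function names and the dictionary layout for anchor pairs are hypothetical.

```python
from typing import Dict, Tuple

SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def distance_from_round_trip(rtt_s: float, reply_delay_s: float = 0.0) -> float:
    """Convert a round trip time (seconds) into a one-way distance (meters).

    The signal travels the anchor-to-anchor path twice, so the time of flight
    is half of the round trip time after removing the responder's reply delay.
    """
    time_of_flight_s = (rtt_s - reply_delay_s) / 2.0
    if time_of_flight_s < 0:
        raise ValueError("reply delay exceeds the measured round trip time")
    return SPEED_OF_LIGHT_M_PER_S * time_of_flight_s


def pairwise_distances(
    rtt_by_pair: Dict[Tuple[str, str], float], reply_delay_s: float = 0.0
) -> Dict[Tuple[str, str], float]:
    """Compute a distance for every measured pair of anchors."""
    return {
        pair: distance_from_round_trip(rtt_s, reply_delay_s)
        for pair, rtt_s in rtt_by_pair.items()
    }


if __name__ == "__main__":
    # Hypothetical measurements for three anchors, in seconds.
    measurements = {("A", "B"): 66.7e-9, ("A", "C"): 40.0e-9, ("B", "C"): 53.4e-9}
    for pair, meters in pairwise_distances(measurements).items():
        print(pair, round(meters, 2))
```

With N anchors there are N(N-1)/2 such pairs, and the resulting distance table is the input from which anchor locations would then be estimated.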
A method for handling contradictory queries on a shared device includes receiving a first query issued by a first user, the first query specifying a first long-standing operation for a digital assistant to perform, and while the digital assistant is performing the first long-standing operation, receiving a second query, the second query specifying a second long-standing operation for the digital assistant to perform. The method also includes determining that the second query was issued by another user different than the first user and determining, using a query resolver, that performing the second long-standing operation would conflict with the first long-standing operation. The method further includes identifying one or more compromise operations for the digital assistant to perform, and instructing the digital assistant to perform a selected compromise operation among the identified one or more compromise operations.
G10L 17/02 - Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
G10L 17/06 - Decision making techniques; Pattern matching strategies
A voltage regulator having multiple main stages and at least one accelerated voltage regulator (AVR) bridge is provided. The main stages may respond to low frequency current transients and provide DC output voltage regulation. The AVR bridges are switched much faster than the main stages and respond to high frequency current transients without regulating the DC output voltage. The AVR bridge frequency response range can overlap with the main stage frequency response range, and the lowest frequency to which the AVR bridges respond may be set lower than the highest frequency to which the main stages respond.
H02M 3/335 - Conversion of dc power input into dc power output with intermediate conversion into ac by static converters using discharge tubes with control electrode or semiconductor devices with control electrode to produce the intermediate ac using devices of a triode or a transistor type requiring continuous application of a control signal using semiconductor devices only
H02M 1/00 - APPARATUS FOR CONVERSION BETWEEN AC AND AC, BETWEEN AC AND DC, OR BETWEEN DC AND DC, AND FOR USE WITH MAINS OR SIMILAR POWER SUPPLY SYSTEMS; CONVERSION OF DC OR AC INPUT POWER INTO SURGE OUTPUT POWER; CONTROL OR REGULATION THEREOF - Details of apparatus for conversion
WORKLOAD SCHEDULING USING QUEUES WITH DIFFERENT PRIORITIES
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for scheduling workloads on computing resources using a high priority queue and a low priority queue. The high priority queue maintains pending high priority workloads to be scheduled for execution, and the low priority queue maintains pending low priority workloads to be scheduled for execution. The computing system as described in this specification schedules the pending low priority workloads for execution by utilizing computing resources provided by the system only when the high priority queue is empty.
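The scheduling rule described above, in which low priority workloads run only when the high priority queue is empty, can be sketched as follows. First-in, first-out ordering within each queue, the string workload identifiers, and the class and method names are assumptions for illustration only.

```python
from collections import deque
from typing import Deque, Optional


class TwoQueueScheduler:
    """Hypothetical scheduler with one high priority and one low priority queue."""

    def __init__(self) -> None:
        self.high_priority: Deque[str] = deque()
        self.low_priority: Deque[str] = deque()

    def submit(self, workload: str, high_priority: bool) -> None:
        """Add a pending workload to the queue matching its priority."""
        if high_priority:
            self.high_priority.append(workload)
        else:
            self.low_priority.append(workload)

    def next_workload(self) -> Optional[str]:
        """Return the next workload to execute, or None when both queues are empty.

        A low priority workload is returned only when no high priority
        workload is pending.
        """
        if self.high_priority:
            return self.high_priority.popleft()
        if self.low_priority:
            return self.low_priority.popleft()
        return None


if __name__ == "__main__":
    scheduler = TwoQueueScheduler()
    scheduler.submit("batch-report", high_priority=False)
    scheduler.submit("serve-request", high_priority=True)
    print(scheduler.next_workload())  # serve-request
    print(scheduler.next_workload())  # batch-report
    print(scheduler.next_workload())  # None
```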
Example embodiments of the present disclosure provide an example computer-implemented method for constructing a three-dimensional semantic segmentation of a scene from two-dimensional inputs. The example method includes obtaining, by a computing system comprising one or more processors, an image set comprising one or more views of a subject scene. The example method includes generating, by the computing system and based at least in part on the image set, a scene representation describing the subject scene in three dimensions. The example method includes generating, by the computing system and using a machine-learned semantic segmentation model framework, a multidimensional field of probability distributions over semantic categories, the multidimensional field defined over the three dimensions of the subject scene. The example method includes outputting, by the computing system, classification data for at least one location in the subject scene.
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06T 7/143 - Segmentation; Edge detection involving probabilistic approaches, e.g. Markov random field [MRF] modelling