An electronic device obtains a plurality of media items, including, for each media item in the plurality, a set of attributes of the media item. The device provides the set of attributes for each media item of the plurality of media items to a machine learning model that is trained to determine pairwise similarity distances between respective media items in the plurality of media items, and generates an acyclic graph from an output of the machine learning model. The device clusters nodes of the acyclic graph, each node corresponding to a media item. Based on the clustering, the electronic device modifies metadata associated with a first media item in a first cluster and displays a representation of the first media item in a user interface according to the modified metadata.
G06F 16/64 - Browsing; Visualisation therefor
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
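The graph-then-cluster step in the abstract above can be illustrated with a minimal sketch. It assumes the acyclic graph is a minimum spanning tree over the pairwise distances and that clusters are formed by cutting long tree edges; neither choice is stated in the abstract. The toy `attributes` list, the absolute-difference distance (a stand-in for the trained model's output), and the `threshold` value are all hypothetical.

```python
from itertools import combinations

def _find(parent, x):
    # Union-find root lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def minimum_spanning_tree(n, dist):
    """Kruskal's algorithm; dist maps (i, j) pairs to a pairwise distance."""
    parent = list(range(n))
    tree = []
    for (i, j), d in sorted(dist.items(), key=lambda kv: kv[1]):
        ri, rj = _find(parent, i), _find(parent, j)
        if ri != rj:  # adding this edge cannot create a cycle
            parent[ri] = rj
            tree.append((i, j, d))
    return tree

def cluster_tree(n, tree, threshold):
    """Keep only tree edges no longer than threshold; components are clusters."""
    parent = list(range(n))
    for i, j, d in tree:
        if d <= threshold:
            parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for node in range(n):
        groups.setdefault(_find(parent, node), []).append(node)
    return sorted(groups.values())

# Hypothetical one-dimensional attribute per media item.
attributes = [0, 1, 2, 50, 51]
# Stand-in for the model's pairwise similarity distances (assumption).
dist = {(i, j): abs(attributes[i] - attributes[j])
        for i, j in combinations(range(len(attributes)), 2)}
tree = minimum_spanning_tree(len(attributes), dist)
clusters = cluster_tree(len(attributes), tree, threshold=10)
```

Here items 0 to 2 and items 3 to 4 end up in separate clusters, after which per-cluster metadata could be updated.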
A method includes obtaining lyrics text and audio for a media item and generating, using a first encoder, a first plurality of embeddings representing symbols that appear in the lyrics text for the media item. The method includes generating, using a second encoder, a second plurality of embeddings representing an acoustic representation of the audio for the media item. The method includes determining respective similarities between embeddings of the first plurality of embeddings and embeddings of the second plurality of embeddings and aligning the lyrics text and the audio for the media item based on the respective similarities. The method includes, while streaming the audio for the media item, providing, for display, the aligned lyrics text with the streamed audio.
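One way to picture the alignment step is a similarity matrix between the two sets of embeddings followed by a monotonic path search. The sketch below assumes dot-product similarity and a simple dynamic program in which each audio frame is assigned to one lyric symbol and symbols appear in order; the abstract does not specify either, and the toy embeddings are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def align(text_emb, audio_emb):
    """Return, for each audio frame, the index of the lyric symbol it maps to.

    Assumes at least as many frames as symbols. The path starts at symbol 0,
    ends at the last symbol, and either stays or advances by one per frame.
    """
    n, m = len(text_emb), len(audio_emb)
    sim = [[dot(t, a) for a in audio_emb] for t in text_emb]
    neg = float("-inf")
    best = [[neg] * m for _ in range(n)]
    back = [[0] * m for _ in range(n)]
    best[0][0] = sim[0][0]
    for j in range(1, m):
        for i in range(min(n, j + 1)):
            stay = best[i][j - 1]
            advance = best[i - 1][j - 1] if i > 0 else neg
            if advance > stay:
                best[i][j] = advance + sim[i][j]
                back[i][j] = i - 1
            else:
                best[i][j] = stay + sim[i][j]
                back[i][j] = i
    # Trace the best path back from the last symbol at the last frame.
    path = [0] * m
    i = n - 1
    for j in range(m - 1, -1, -1):
        path[j] = i
        i = back[i][j]
    return path

# Hypothetical embeddings: two lyric symbols, four audio frames.
text_emb = [[1.0, 0.0], [0.0, 1.0]]
audio_emb = [[1.0, 0.0], [0.9, 0.1], [0.1, 0.9], [0.0, 1.0]]
frame_to_symbol = align(text_emb, audio_emb)
```

With frame timestamps attached, the resulting frame-to-symbol map gives a start time for each lyric symbol, which is what a player would need to display lyrics in sync.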
Simulator augmented content selection is provided by initializing a content selection object according to session initialization parameter values associated with a simulated media content playback session. The content selection object corresponds to a candidate content selection machine learning model trained to predict selectable content media items for at least one simulated user. A simulated session including a sequence of predicted simulated user next actions and one or more predicted sets of selectable content items is generated by applying a simulated user model to content items identified by the initialized content selection object, where the simulated user model is trained to predict a next action of the simulated user in response to a simulated playback input received from the simulated user and each set of the selectable content items is correlated to each next action in the sequence of predicted simulated user next actions.
H04L 65/1069 - Session management Session establishment or de-establishment
H04L 65/613 - Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for the control of the source by the destination
4. SYSTEMS AND METHODS FOR GENERATING PERSONALIZED PLAYLISTS
The various implementations described herein include methods and devices for generating personalized playlists. In one aspect, a method includes obtaining information about recent media items presented to a user, the information including data about a respective time of day and day of week each media item was presented to the user. The method further includes grouping the recent media items into clusters based on time of day and day of week; and generating a recommendation vector using a weighted average of the clusters. The method also includes generating a playlist for the user by identifying a plurality of media items using the recommendation vector; and presenting the playlist to the user.
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
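The clustering and weighted-average steps of the personalized playlist abstract can be sketched as follows. The bucketing into weekday/weekend and morning/evening, the `off_weight` of 0.25 for clusters outside the current bucket, and the dot-product ranking are illustrative assumptions, not details from the abstract.

```python
from collections import defaultdict

def bucket(weekday, hour):
    # Hypothetical clustering key: day type and part of day.
    day_type = "weekend" if weekday >= 5 else "weekday"
    part = "morning" if hour < 12 else "evening"
    return (day_type, part)

def centroid(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def recommendation_vector(plays, now_weekday, now_hour, off_weight=0.25):
    """Weighted average of cluster centroids; the current bucket gets weight 1."""
    clusters = defaultdict(list)
    for weekday, hour, vec in plays:
        clusters[bucket(weekday, hour)].append(vec)
    current = bucket(now_weekday, now_hour)
    acc, total = None, 0.0
    for key, vecs in clusters.items():
        w = 1.0 if key == current else off_weight
        c = centroid(vecs)
        acc = [w * x for x in c] if acc is None else [a + w * x for a, x in zip(acc, c)]
        total += w
    return [a / total for a in acc]

def make_playlist(rec_vec, catalog, size):
    # Rank catalog items by dot product with the recommendation vector.
    score = lambda item: sum(x * y for x, y in zip(rec_vec, catalog[item]))
    return sorted(catalog, key=score, reverse=True)[:size]

# Hypothetical recent plays: (weekday 0-6, hour 0-23, item embedding).
plays = [(0, 8, [1.0, 0.0]), (1, 9, [1.0, 0.0]), (5, 20, [0.0, 1.0])]
rec = recommendation_vector(plays, now_weekday=0, now_hour=8)
playlist = make_playlist(rec, {"a": [1.0, 0.0], "b": [0.0, 1.0]}, size=2)
```

On a weekday morning the weekday-morning cluster dominates, so item `a` ranks ahead of item `b`.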
The various implementations described herein include methods and devices for identifying and presenting content to users. In one aspect, a method includes providing a domain specific language (DSL) tool to a user of the computing device and receiving a plurality of user inputs via the DSL tool. The plurality of user inputs includes: an input identifying a DSL object corresponding to a media pool; an input identifying a DSL object corresponding to a mutator to be applied to the media pool; and inputs identifying a plurality of DSL objects corresponding to respective objectives for a media set list. The method also includes generating the media set list from the media pool based on the mutator and the objectives and presenting information about the generated media set list to the user.
Contrastive learning is used to learn an alternative embedding. A subtree replacement strategy generates structurally similar pairs of samples from an input space for use in contrastive learning. The resulting embedding captures more of the structural proximity relationships of the input space and improves Bayesian optimization performance when applied to tasks such as fitting and optimization.
The present application describes various methods and devices for providing content to users. In one aspect, a method includes, for each content item of a set of content items, obtaining a score for the content item using a recommender system, the score corresponding to a calculation of subsequent repeated engagement by a user with the content item. The method also includes ranking the set of content items based on the respective scores and providing recommendation information to the user for one or more highest ranked content items in the set of content items.
An electronic device generates a respective user queue for each user of a plurality of users participating in a shared listening session. While providing a first media content item for playback, the device receives a second request, from a first user, to add a second media content item to the shared playback queue and updates the respective user queue for the first user. After receiving the second request, the electronic device receives a third request, from a second user, to add a third media content item to the shared playback queue and updates the respective user queue for the second user. The electronic device updates the shared playback queue using the respective user queues of the first user and the second user, including positioning the third media content item in an order of the shared playback queue to be played back before the second media content item.
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04N 21/25 - Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication or
H04N 21/258 - Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive co
H04N 21/262 - Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying
H04N 21/442 - Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available in the
H04N 21/45 - Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferen
H04N 21/458 - Scheduling content for creating a personalised stream, e.g. by combining a locally stored advertisement with an incoming stream; Updating operations, e.g. for OS modules
H04N 21/466 - Learning process for intelligent management, e.g. learning user preferences for recommending movies
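The shared-queue abstract above describes a later request from a second user being placed ahead of an earlier request from the first user. One policy that produces this behaviour is round-robin interleaving of the per-user queues, starting with the user after the one whose item is currently playing. The sketch below assumes that policy; the abstract does not name it, and the user names and track identifiers are hypothetical.

```python
from collections import deque

def build_shared_queue(user_queues, turn_order, last_played_by=None):
    """Interleave per-user queues round-robin.

    If last_played_by is given, turns start with the next user in turn_order,
    so the user whose item is currently playing goes last in each round.
    """
    queues = {user: deque(items) for user, items in user_queues.items()}
    order = list(turn_order)
    if last_played_by in order:
        k = order.index(last_played_by) + 1
        order = order[k:] + order[:k]
    shared = []
    while any(queues[user] for user in order):
        for user in order:
            if queues[user]:
                shared.append(queues[user].popleft())
    return shared

# Alice's item is playing; Alice then adds track2, and Bob later adds track3.
user_queues = {"alice": ["track2"], "bob": ["track3"]}
shared = build_shared_queue(user_queues, ["alice", "bob"], last_played_by="alice")
```

Because Alice's item is already playing, Bob's `track3` is ordered before Alice's `track2` even though it was requested later.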
This disclosure is directed to an enhanced audio file generator. One aspect is a method of enhancing input speech in an input audio file, the method comprising receiving the input audio file representing the input speech, wherein the input audio file is recorded at an audio recording device, and generating an enhanced audio file by applying an audio transformation model to the input audio file, wherein applying the audio transformation model to generate the enhanced audio file comprises extracting parameters defining audio features from the input audio file, the parameters including a noise parameter defining noise in the input audio file and one or more other preset parameters respectively defining other audio features, synthesizing clean speech based on the extracted parameters including the noise parameter, wherein synthesizing the clean speech comprises transforming the noise parameter to defined value(s); and generating the enhanced audio file with the synthesized clean speech.
G10L 21/0264 - Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
G10L 13/047 - Architecture of speech synthesisers
G10L 25/03 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of extracted parameters
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
In general, this disclosure is directed to generating and discovering a group media playback session that is conducted in a media playback device. One aspect is a method of controlling a media output device, the method comprising establishing a media playback session at a host device through the media output device, wirelessly broadcasting a participant ID from a participant device to the host device, associating, at the host device, the participant ID with a session ID for the media playback session and sending the association to a server, and transmitting session information from the server to the participant device to permit the participant device to join the media playback session to adjust playback at the media output device.
H04L 65/1069 - Session management Session establishment or de-establishment
H04N 21/63 - Control signalling between client, server and network components; Network processes for video distribution between server and clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Interne; Communication protocols; Addressing
A request to play a media content item is received. It is determined whether the play request is ambiguous. Responsive to determining that the play request is ambiguous, it is determined whether to play a suspended media content item or an alternate media content item. The determination can be made based on a length of time that the suspended media content item has been suspended, a media content item type, or a state, among other factors. Responsive to the determination, playback of the suspended or alternate media content item is initiated.
H04L 65/613 - Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for the control of the source by the destination
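The decision described in the ambiguous-play-request abstract can be sketched as a small rule-based function. The 30-minute threshold, the set of long-form item types, and all names below are hypothetical; the abstract only lists suspension length, item type, and state as possible factors.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SuspendedItem:
    item_id: str
    item_type: str            # e.g. "track" or "podcast" (hypothetical labels)
    suspended_seconds: float  # how long the item has been suspended

def resolve_play_request(requested_id: Optional[str],
                         suspended: Optional[SuspendedItem],
                         alternate_id: str,
                         max_suspended_seconds: float = 1800.0,
                         long_form_types=("podcast", "audiobook")) -> str:
    """Pick what to play; a request with no named item is treated as ambiguous."""
    if requested_id is not None:
        return requested_id  # unambiguous: play what was asked for
    if suspended is None:
        return alternate_id
    # Assumption: long-form content is always worth resuming.
    if suspended.item_type in long_form_types:
        return suspended.item_id
    # Assumption: recently suspended items are resumed, stale ones are not.
    if suspended.suspended_seconds <= max_suspended_seconds:
        return suspended.item_id
    return alternate_id

recent = SuspendedItem("song_1", "track", suspended_seconds=60.0)
stale = SuspendedItem("song_1", "track", suspended_seconds=86400.0)
stale_podcast = SuspendedItem("pod_7", "podcast", suspended_seconds=86400.0)
```

A bare "play" shortly after pausing resumes `song_1`, while the same request a day later falls through to the alternate item unless the suspended item is long-form.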
This disclosure concerns the provision of media, and more particularly streaming of media. In particular, one aspect herein relates to a method performed by a server system of streaming an audio content item to an electronic device. In response to receiving a request message from the electronic device, a selected audio content item is retrieved from a first storage. Descriptive metadata including an origin-ID associated with the retrieved audio content item is determined. A second storage is browsed utilizing said metadata including the origin-ID to locate non-static media content item(s) associated with the origin-ID. In response to finding a non-static media content item associated with the origin ID, the selected audio content item is sent along with the located non-static media content item to the electronic device for simultaneous presentation of the audio content item and the located non-static media content item.
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
H04N 21/4722 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content for requesting additional data associated with the content
H04N 21/431 - Generation of visual interfaces; Content or additional data rendering
G06F 16/48 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/583 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
G06F 16/783 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
Utterance-based user interfaces can include activation trigger processing techniques for detecting activation triggers and causing execution of certain commands associated with particular command pattern activation triggers without waiting for output from a separate speech processing engine. The activation trigger processing techniques can also detect speech analysis patterns and selectively activate a speech processing engine.
An electronic device associated with a media-providing service displays a user interface that includes a representation of a first media item. While the representation of the first media item is displayed, the electronic device initiates playback of a preview of the first media item. The electronic device further detects a first input by a user to display a representation of a second media item. Then, while the electronic device is displaying the representation of the second media item, the electronic device initiates playback of a preview of the second media item. Based on a determination that the preview of the second media item has completed playback, the electronic device plays the second media item and adds the second media item to the user's playback history without further intervention by the user.
H04N 21/482 - End-user interface for program selection
G06F 3/0485 - Scrolling or panning
G06F 3/0488 - Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. gestures based on pressure exer using a touch-screen or digitiser, e.g. input of commands through traced gestures
15. SYSTEMS AND METHODS FOR GENERATING PERSONALIZED POOLS OF CANDIDATE MEDIA ITEMS
An electronic device stores, for a user of a media-providing service, a playback history that includes information about media items that have previously been consumed by the user. The electronic device receives a request to search for media content including search criteria. In response to the request, and without additional user intervention, the electronic device generates a vector representation of the user using media items from the playback history of the user that are relevant to the search criteria. The electronic device identifies one or more media content items from a media content library that match the vector representation of the user and the search criteria and provides, to the user, the one or more media content items.
Methods, systems and computer program products are provided for content generation. A distribution of policies is defined based on an action space. Distribution parameters are received from a reinforcement learning (RL) algorithm. In turn, a policy is randomly sampled from the distribution of policies. A candidate content item is generated using the sampled policy. A quality of the candidate content item is measured based on predefined quality criteria and a parameter model is adjusted as specified by the reinforcement learning algorithm to obtain a plurality of updated distribution parameters. Environment settings are passed to a trained parameter model to obtain a plurality of policy distribution parameters. A predetermined number of policies from the distribution of policies are then sampled and the plurality of environment settings are passed to the predetermined number of sampled policies to obtain at least one content item.
A media content item recommendation system recommends media content items based on one or more attributes of a seed playlist. The recommended media content items can be determined from a plurality of existing playlists that have been created over a period of time. Such existing playlists can be selected based on similarity to the seed playlist.
A cuepoint determination system utilizes a convolutional neural network (CNN) to determine cuepoint placements within media content items to facilitate smooth transitions between them. For example, audio content from a media content item is normalized to a plurality of beats, the beats are partitioned into temporal sections, and acoustic feature groups are extracted from each beat in one or more of the temporal sections. The acoustic feature groups include at least downbeat confidence, position in bar, peak loudness, timbre and pitch. The extracted acoustic feature groups for each beat are provided as input to the CNN on a per temporal section basis to predict whether a beat immediately following the temporal section within the media content item is a candidate for cuepoint placement. A cuepoint placement is then determined from among the candidate cuepoint placements predicted by the CNN.
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
The various implementations described herein include methods and devices for media segmentation. In one aspect, a method includes obtaining audio content for a podcast and generating sentence embeddings for the audio content. The method also includes generating segment embeddings using the sentence embeddings and context information, and determining, for each segment embedding, whether the segment embedding includes a topic transition for the podcast. The method further includes generating one or more topic transition timestamps for the podcast in accordance with the determining.
The various implementations described herein include methods and devices for media discovery. In one aspect, a method includes obtaining a pre-trained recommender model that has been trained using contrastive learning with feature-level augmentation and instance-level augmentation. The method further includes generating, via the model, a user embedding based on features of the user and generating, via the model, a respective episode embedding for each episode of a plurality of episodes, each respective episode embedding based on features of the corresponding episode. The method also includes generating, via the model, a respective similarity score for each episode, the respective similarity score corresponding to a latent similarity between the user embedding and the respective episode embedding, and ranking the episodes in accordance with the respective similarity scores. The method further includes recommending the highest ranked episode to the user.
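The scoring and ranking steps can be illustrated with a short sketch. It assumes the embeddings have already been produced by the recommender model and uses cosine similarity as the similarity score; the abstract does not say which similarity measure is used, and the embedding values below are hypothetical.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank_episodes(user_embedding, episode_embeddings):
    """Return episode ids sorted from most to least similar to the user."""
    scores = {ep: cosine(user_embedding, emb)
              for ep, emb in episode_embeddings.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical embeddings standing in for the model's outputs.
user = [1.0, 0.0]
episodes = {"ep1": [0.0, 1.0], "ep2": [1.0, 0.1], "ep3": [0.5, 0.5]}
ranking = rank_episodes(user, episodes)
top_recommendation = ranking[0]
```

The first element of `ranking` is the episode that would be recommended to the user.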
The various implementations described herein include methods and devices for speaker diarization. In one aspect, a method includes obtaining an audio recording and generating an embedding signal from the audio recording. The method further includes factoring the embedding signal to obtain a basis matrix and an activation matrix, including obtaining a sparse optimization of the embedding signal by minimizing a norm corresponding to the factored embedding signal. The method also includes generating a speaker log for the audio recording based on the sparse optimization of the embedding signal.
Systems, methods, and devices for human-machine interfaces for utterance-based playlist selection are disclosed. In one method, a list of playlists is traversed and a portion of each is audibly output until a playlist command is received. Based on the playlist command, the traversing is stopped and a playlist is selected for playback. In examples, the list of playlists is modified based on a modification input.
A server obtains user data corresponding to a first content domain. The server identifies, from the user data, a plurality of labels. A respective label of the plurality of labels corresponds to a distinct characteristic of content items of the first content domain. The server utilizes a neural network to generate a plurality of user embeddings. A respective user embedding of the plurality of user embeddings includes a plurality of labels that correspond to a respective user. The server determines, using the plurality of user embeddings, a first content item of a plurality of content items of a second content domain that meets matching criteria for a first user. The server further provides, to a device of the first user, information that corresponds to the first content item of the second content domain.
G06F 16/435 - Filtering based on additional data, e.g. user or group profiles
G06F 16/438 - Presentation of query results
G06F 16/483 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
24. PERSONALIZING EXPLAINABLE RECOMMENDATIONS WITH BANDITS
Methods, systems and computer program products are provided for personalizing recommendations of items with associated explanations. The example embodiments described herein use contextual bandits to personalize explainable recommendations (“recsplanations”) as treatments (“Bart”). Bart learns and predicts satisfaction (e.g., click-through rate, consumption probability) for any combination of item, explanation, and context and, through logging and contextual bandit retraining, can learn from its mistakes in an online setting.
G06F 16/635 - Filtering based on additional data, e.g. user or group profiles
G06F 16/638 - Presentation of query results
G06F 18/21 - Design or setup of systems or techniques; Extraction of features in feature space; Blind source separation
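To make the bandit idea concrete, the following is a deliberately simplified sketch, not the model described in the abstract. It assumes a tabular epsilon-greedy policy in which each arm is an (item, explanation) pair and satisfaction is tracked as a running mean per (context, arm). The class name, arm labels, and context strings are hypothetical.

```python
import random
from collections import defaultdict

class EpsilonGreedyBandit:
    """Tabular contextual bandit over (item, explanation) arms."""

    def __init__(self, arms, epsilon=0.1, seed=0):
        self.arms = list(arms)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)
        self.means = defaultdict(float)  # estimated satisfaction per (context, arm)

    def choose(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.arms)
        return max(self.arms, key=lambda arm: self.means[(context, arm)])

    def update(self, context, arm, reward):
        # Incremental running mean of the logged reward (e.g. a click = 1.0).
        key = (context, arm)
        self.counts[key] += 1
        self.means[key] += (reward - self.means[key]) / self.counts[key]

arms = [("jazz_mix", "because you like jazz"), ("jazz_mix", "popular now")]
bandit = EpsilonGreedyBandit(arms, epsilon=0.0)  # no exploration, for a deterministic demo
bandit.update("morning", arms[1], reward=1.0)
bandit.update("morning", arms[0], reward=0.0)
choice = bandit.choose("morning")
```

After one positive and one negative logged outcome, the policy prefers the "popular now" explanation in the morning context; a production system would replace the lookup table with a learned reward model.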
25. USING A HIERARCHICAL MACHINE LEARNING ALGORITHM FOR PROVIDING PERSONALIZED MEDIA CONTENT
An electronic device generates a score for each objective in a hierarchy of objectives. Generating the score comprises, using a first machine learning algorithm, generating a score for a first objective corresponding to a first level in the hierarchy of the objectives and using an output of the first machine learning algorithm, distinct from the score for the first objective, as an input to a second machine learning algorithm to generate a score for a second objective corresponding to a second level in the hierarchy of objectives. The electronic device generates a combined score using the score for the first objective and the score for the second objective. The electronic device selects, automatically without user input, media content based on the combined scores for the plurality of media content items and streams, using an application of the media-providing service, one or more of the selected media content to a user.
A full attention mechanism of a multilingual transformer model is converted into a Longformer attention mechanism to generate a Longformer multilingual transformer model. The Longformer multilingual transformer model is finetuned to perform a summarization task based on episode-description:episode-transcript pairs, thereby generating a finetuned Longformer multilingual transformer model. The Longformer multilingual transformer model also can further be finetuned to perform a summarization task based on article-summary:full-original-article pairs. A summary of a query episode transcript can be generated using the single-finetuned Longformer multilingual transformer model and/or the double-finetuned Longformer multilingual transformer model. The multilingual transformer-based model enables systems, methods and computer products to be capable of generating multilingual abstractive summaries.
G06F 40/58 - Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
An electronic device receives a first media content item and receives information indicating: a first insertion time within the first media content item; and a second media content item to be played at the first insertion time and/or one or more properties of the second media content item. The electronic device stores the first media content item. The electronic device provides the first media content item to a second electronic device, including queuing the second electronic device to play back, in sequence and without user intervention: the first media content item until the first insertion time; the second media content item at the first insertion time; and the first media content item resumed after playback of the second media content item is ceased.
G06F 16/68 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
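The playback sequence in the insertion-time abstract above can be expressed as an ordered list of segments. The sketch below is one possible representation: the function name, the tuple layout `(item_id, start_seconds, end_seconds)`, and the use of `None` to mean "play to the end" are illustrative assumptions.

```python
def playback_plan(first_id, first_duration, insertion_time, second_id):
    """Return ordered (item_id, start_seconds, end_seconds) segments.

    An end of None means the item plays to its natural end.
    """
    if not 0 <= insertion_time <= first_duration:
        raise ValueError("insertion_time must fall within the first item")
    plan = []
    if insertion_time > 0:
        # First item up to the insertion time.
        plan.append((first_id, 0.0, insertion_time))
    # Inserted item, played in full.
    plan.append((second_id, 0.0, None))
    if insertion_time < first_duration:
        # First item resumed where it left off.
        plan.append((first_id, insertion_time, first_duration))
    return plan

plan = playback_plan("episode", 600.0, 120.0, "inserted_clip")
```

A client that receives such a plan can play the segments back to back without user intervention.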
In accordance with an embodiment, described herein is a system and method for providing a live lyrics overlay in a social messaging environment. The system can utilize advances in three-dimensional mapping technology that allow social messaging services to offer real-time video lenses or overlays to their users, and extends this three-dimensional mapping technology to support lyrics. During creation of a video with a lyrics lens overlay, the lyrics corresponding to a selected song are retrieved from a lyrics source and are displayed within the video. For example, with the lyrics lens, a user can record an image of themselves on live video, singing along to a song clip, with the lyrics of the song displayed as if they were coming from the user's mouth. The created live lyrics content can also be shared with other users of a social messaging environment.
G06F 16/40 - Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
G06F 16/435 - Filtering based on additional data, e.g. user or group profiles
G06F 16/9535 - Search customisation based on user profiles and personalisation
G06Q 10/107 - Computer-aided management of electronic mailing
G06Q 50/00 - Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
H04L 51/52 - User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail for supporting social networking services
G06F 16/483 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
G06F 16/9536 - Search customisation based on social or collaborative filtering
G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
H04N 21/431 - Generation of visual interfaces; Content or additional data rendering
H04N 21/462 - Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a Head-end or controlling the complexity of a video stream by scaling the resolution o
G06F 3/0482 - Interaction with lists of selectable items, e.g. menus
H04L 65/403 - Arrangements for multi-party communication, e.g. for conferences
H04N 21/222 - Secondary servers, e.g. proxy server or cable television Head-end
H04N 21/233 - Processing of audio elementary streams
H04N 21/234 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs
H04N 21/235 - Processing of additional data, e.g. scrambling of additional data or processing content descriptors
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 21/439 - Processing of audio elementary streams
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 21/4788 - Supplemental services, e.g. displaying phone caller identification or shopping application communicating with other users, e.g. chatting
H04N 21/8545 - Content authoring for generating interactive applications
29. SYSTEMS AND METHODS FOR BIDIRECTIONAL COMMUNICATION WITHIN A WEBSITE DISPLAYED WITHIN A MOBILE APPLICATION
A method is performed at an electronic device. The method includes displaying, in a mobile application provided by a media content provider, a user interface that includes one or more media content items. The method further includes displaying, within a browser displayed within the mobile application, external content that is associated with a content provider distinct from the media content provider, including displaying a first set of controls within the external content. The method includes, while displaying the external content, receiving a first user input selecting a first control of the first set of controls and, in response to the first user input selecting the first control, sending a command to the mobile application to perform an action and performing, by the mobile application, the action corresponding to the first control.
G06F 16/954 - Navigation, e.g. using categorised browsing
G06F 16/438 - Presentation of query results
G06F 16/435 - Filtering based on additional data, e.g. user or group profiles
G06F 3/0484 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range of values
30.
METHODS AND SYSTEMS FOR INTERACTIVE QUEUING FOR SHARED LISTENING SESSIONS BASED ON USER SATISFACTION
An electronic device stores a shared playback queue for a shared playback session, the shared playback queue comprising one or more media content items, including a first media content item associated with a first user and a second media content item associated with a second user of the plurality of users. The device receives a request to adjust the shared playback queue. The device determines an order for the adjusted shared playback queue based at least in part on media preferences indicated in a profile of a third user of the plurality of users participating in the shared playback session, wherein the third user is distinct from the first user and the second user. The device provides the first media content item and the second media content item based on the order of the shared playback queue.
H04N 21/482 - End-user interface for program selection
H04N 21/475 - End-user interface for inputting end-user data, e.g. personal identification number [PIN] or preference data
H04N 21/45 - Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning of preferen
31.
SYSTEMS AND METHODS FOR SELECTING IMAGES FOR A MEDIA ITEM
An electronic device obtains a collection of images, where each image in the collection of images is associated with a first set of scores. The first set of scores includes text descriptors associated with each image. The electronic device obtains a media item associated with a second score. The second score is associated with each respective text descriptor of the set of text descriptors. The electronic device then selects a subset of the collection of images based on the first set of scores and the second set of scores. The electronic device concurrently presents a respective image of the subset of the collection of images and the media item.
G06F 16/438 - Presentation of query results
G06F 16/535 - Filtering based on additional data, e.g. user or group profiles
G06F 16/583 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
A system for device discovery for social playback is disclosed. The system operates to connect a host media playback device to a media output device and broadcast a social playback session to guest media playback devices. Upon joining a social playback session, a guest media playback device may control the media playback at the host media playback device, where the media output for the social playback session is provided by the media output device.
H04L 65/60 - Network streaming of media packets
H04W 4/80 - Services using short range communication, e.g. near-field communication, radio-frequency identification or low energy communication
An audio translation system includes a feature extractor and a style transfer machine learning model. The feature extractor generates for each of a plurality of source voice files one or more source voice parameters encoded as a collection of source feature vectors, and generates for each of a plurality of target voice files one or more target voice parameters encoded as a collection of target feature vectors. The style transfer machine learning model is trained on the collection of source feature vectors for the plurality of source voice files and the collection of target feature vectors for the plurality of target voice files to generate a style transformed feature vector.
G10L 21/003 - Changing voice quality, e.g. pitch or formants
G10L 25/45 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the type of analysis window
G10L 25/75 - Speech or voice analysis techniques not restricted to a single one of groups for modelling vocal tract parameters
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
34.
Selection of a wireless device to be remotely controlled by a user interface device for media presentation
A method includes receiving a Bluetooth Low Energy (BLE) advertising message from a user interface (UI) device. The method includes, responsive to a receipt of the BLE advertising message from the UI device: waking up an application module of the first wireless device and authorizing the UI device to remotely control media presentation as presented by the application module. The method includes determining a first determination of whether the first wireless device is paired or is in a current cabled connection with an electronic device that is distinct from the UI device; and in accordance with the first determination being a determination that the first wireless device is not paired with the electronic device and is not in a current cabled connection with the electronic device, automatically terminating the authorization of the UI device to remotely control media presentation as presented by the application module.
H04N 21/414 - Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
H04L 67/125 - Protocols specially adapted for proprietary or special-purpose networking environments, e.g. medical networks, sensor networks, networks in vehicles or remote metering networks involving control of end-device applications over a network
H04N 21/41 - Structure of client; Structure of client peripherals
H04W 4/48 - Services specially adapted for particular environments, situations or purposes for vehicles, e.g. vehicle-to-pedestrian communication for in-vehicle communication
H04W 4/80 - Services using short range communication, e.g. near-field communication, radio-frequency identification or low energy communication
Media content episodes are received. Using machine learning, one or more media segments of interest are identified in each of the media content episodes based at least in part on an analysis of content included in a corresponding audio content episode. Each of the identified media segments is associated with one or more automatically determined tags. Using machine learning, a recommended media segment is selected for a specific user from the identified media segments based at least in part on attributes of the specific user and the automatically determined tags of the identified media segments. The recommended media segment is automatically provided in a media segment feed.
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
G06F 3/14 - Digital output to display device
G10L 17/00 - Speaker identification or verification
G06N 5/04 - Inference or reasoning models
G06F 16/638 - Presentation of query results
G06F 16/64 - Browsing; Visualisation therefor
Methods, systems, and related products provide detection of media content items that are under-locatable by machine voice-driven retrieval of uttered requests for retrieval of the media items. For a given media item, a resolvability value and/or an utterance resolve frequency is calculated as a ratio of the number of playbacks of the media item by a speech retrieval modality to the total number of playbacks of the media item regardless of retrieval modality. In some examples, the methods, systems and related products also provide for improvement in the locatability of an under-locatable media item by collecting and/or generating one or more pronunciation aliases for the under-locatable item.
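The ratio in the abstract above can be illustrated with a minimal Python sketch. It assumes the utterance resolve frequency is the plain ratio of voice-retrieved playbacks to all playbacks; the function names and the 0.05 cut-off are hypothetical and are not taken from the abstract.

```python
def utterance_resolve_frequency(voice_playbacks: int, total_playbacks: int) -> float:
    """Share of playbacks that were reached through voice retrieval."""
    if total_playbacks <= 0:
        # No playbacks at all: treat the item as having no voice share.
        return 0.0
    return voice_playbacks / total_playbacks


def is_under_locatable(voice_playbacks: int, total_playbacks: int,
                       threshold: float = 0.05) -> bool:
    """Flag an item whose voice share falls below a hypothetical threshold."""
    return utterance_resolve_frequency(voice_playbacks, total_playbacks) < threshold
```

An item flagged this way would be a candidate for the pronunciation-alias step that the abstract mentions.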
A text-to-speech engine creates audio output that includes synthesized speech and one or more media content item snippets. The input text is obtained and partitioned into text sets. A track having lyrics that match a part of one of the text sets is identified. The location of the track's audio that contains the lyric is extracted based on forced alignment data. The extracted audio is combined with synthesized speech corresponding to the remainder of the input text to form audio output.
G10L 13/00 - Speech synthesis; Text to speech systems
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
G10L 13/04 - Methods for producing synthetic speech; Speech synthesisers - Details of speech synthesis systems, e.g. synthesiser structure or memory management
A method of determining relations between music items, wherein a music item is a submix of a musical composition comprising one or more music tracks, the method comprising determining a first input representation for at least part of a first music item, mapping the first input representation onto one or more subspaces derived from a vector space using a first model, wherein each subspace models a characteristic of the music items, determining a second input representation for at least part of a second music item, mapping the second input representation onto the one or more subspaces using a second model, and determining a distance between the mappings of the first and second input representations in each subspace, wherein the distance represents the degree of relation between the first and second input representations with respect to the characteristic modelled by the subspace.
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE Details of electrophonic musical instruments
A method of determining relations between music items, the method comprising determining a first input representation for a symbolic representation of a first music item, mapping the first input representation onto one or more subspaces derived from a vector space using a first model, wherein each subspace models a characteristic of the music items, determining a second input representation for music data representing a second music item, mapping the second input representation onto the one or more subspaces using a second model, and determining a distance between the mappings of the first and second input representations in each subspace, wherein the distance represents the degree of relation between the first and second input representations with respect to the characteristic modelled by the subspace.
A method for training a speech synthesis model adapted to output speech in response to input text is provided. The method includes receiving training data for training said speech synthesis model, the training data comprising speech that corresponds to known text. The method includes training said speech synthesis model. The method includes testing said speech synthesis model using a plurality of text sequences. The method includes calculating at least one metric indicating the performance of the model when synthesising each text sequence. The method includes determining from said metric whether the speech synthesis model requires further training. The method includes determining targeted training text from said calculated metrics, wherein said targeted training text is text related to text sequences where the metric indicated that the model required further training. The method includes outputting said determined targeted training text with a request for further speech corresponding to the targeted training text.
G10L 13/047 - Architecture of speech synthesisers
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
A system and method for media content sequencing. Prior tracks for a listening session are segmented into groups based on attribute scores for an audial attribute. A preferred group is then selected, which can be based on user feedback regarding the prior tracks in the listening session. Candidate tracks, such as from a candidate track pool for future playback in the listening session, are also segmented into the groups of the prior tracks. The candidate tracks can then be ranked based on their associated group and the preferred group.
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE Details of electrophonic musical instruments
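The media content sequencing entry above (segment prior tracks into groups by attribute score, select a preferred group, rank candidates by group) can be sketched in Python as follows. This is only an illustration under stated assumptions: the group boundaries, the +1/-1 feedback encoding, and the "distance between group indices" ranking rule are hypothetical choices, not details given in the abstract.

```python
from bisect import bisect_right


def group_of(score, boundaries):
    """Index of the group that an attribute score falls into.

    `boundaries` is a sorted list of cut points, e.g. [0.33, 0.66]
    yields groups 0, 1 and 2.
    """
    return bisect_right(boundaries, score)


def preferred_group(prior_tracks, boundaries):
    """Pick the group with the highest summed feedback.

    `prior_tracks` is a list of (attribute_score, feedback) pairs, where
    feedback is assumed to be +1 (liked), 0 (neutral) or -1 (skipped).
    """
    totals = {}
    for score, feedback in prior_tracks:
        g = group_of(score, boundaries)
        totals[g] = totals.get(g, 0) + feedback
    return max(totals, key=totals.get)


def rank_candidates(candidates, boundaries, preferred):
    """Order (track_id, attribute_score) pairs by closeness to the preferred group."""
    return sorted(candidates,
                  key=lambda c: abs(group_of(c[1], boundaries) - preferred))
```

For example, with boundaries `[0.33, 0.66]` and prior tracks that were liked only at high attribute scores, candidates in the top group are ranked first.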
A second wake word detector, at a media-playback device, that plays audio (or other) content to a device, such as a voice-enabled device, detects false wake words in the audio content. The second wake word detector analyzes the audio stream to determine if the audio stream contains any audio that sounds like the wake word. If so, the second wake word detector can generate one of a plurality of instructions that describes the time period, within the audio content, in which the false wake word was encountered. The instruction can cause a first wake word detector to assume one of a plurality of configurations. The media-playback device can then instruct or inform the voice-enabled device of the presence of the false wake word. In this way, the wake word detector, at the voice-enabled device, is not activated to receive the false wake word or ignores the wake word.
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
G10L 15/20 - Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise or of stress induced speech
A wake word detector, at a server of a content delivery network (CDN) that provides audio (or other) content to a device, such as a voice-enabled device, detects false wake words in the audio content. The CDN wake word detector analyzes the audio stream to determine if the audio stream contains any audio that sounds like the wake word. If so, the CDN wake word detector can generate metadata that describes the time period, within the audio content, in which the false wake word was encountered. The metadata can include time offsets, from the start of the audio content, which can instruct a voice-enabled device to deactivate during the time period. This metadata is stored and then sent to the media-playback device that requests the media content. The media-playback device can then instruct or inform the voice-enabled device of the presence of the false wake word. In this way, the wake word detector, at the voice-enabled device, is not activated to receive the false wake word.
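As a rough illustration of the time-offset metadata described above, the following Python sketch merges detected false-wake-word intervals into suppression windows and checks a playback position against them. The data layout (pairs of second offsets from the start of the content), the padding value, and both function names are assumptions made for this example.

```python
def build_metadata(detections, padding=0.25):
    """Turn detected (start, end) offsets in seconds into merged windows.

    Each window is widened by `padding` seconds on both sides, and
    overlapping windows are merged so the list stays sorted and disjoint.
    """
    windows = sorted((max(0.0, start - padding), end + padding)
                     for start, end in detections)
    merged = []
    for start, end in windows:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous window: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def should_ignore_wake_word(position, windows):
    """True if the playback position (seconds) lies inside a suppression window."""
    return any(start <= position <= end for start, end in windows)
```

A media-playback device could consult `should_ignore_wake_word` with the current playback offset before forwarding a detection to the voice-enabled device.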
Systems, devices, apparatuses, components, methods, and techniques for predicting user and media-playback device states are provided. Systems, devices, apparatuses, components, methods, and techniques for representing cached, user-selected, and streaming content are also provided.
G06F 15/167 - Interprocessor communication using a common memory, e.g. mailbox
G06N 5/02 - Knowledge representation; Symbolic representation
G06F 12/0888 - Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches using selective caching, e.g. bypass
H04N 21/231 - Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers or prioritizing data for deletion
H04L 67/5681 - Pre-fetching or pre-delivering data based on network characteristics
G06F 12/14 - Protection against unauthorised use of memory
45.
TEXT-TO-SPEECH SYNTHESIS METHOD AND SYSTEM, AND A METHOD OF TRAINING A TEXT-TO-SPEECH SYNTHESIS SYSTEM
A text-to-speech synthesis method includes receiving text, inputting the received text in a synthesizer that includes a prediction network configured to convert the received text into speech data having a speech attribute that includes emotion, intention, projection, pace, and/or accent, and outputting said speech data. The prediction network is obtained by obtaining a first sub-dataset and a second sub-dataset, where the first sub-dataset and the second sub-dataset each include audio samples and corresponding text, and the speech attribute of the audio samples of the second sub-dataset is more pronounced than the speech attribute of the audio samples of the first sub-dataset, training a first model using the first sub-dataset until a performance metric reaches a first predetermined value, training a second model by further training the first model using the second sub-dataset until the performance metric reaches a second predetermined value, and selecting one trained model as the prediction network.
G10L 13/033 - Methods for producing synthetic speech; Speech synthesisers Voice editing, e.g. manipulating the voice of the synthesiser
G10L 13/047 - Architecture of speech synthesisers
G10L 13/027 - Concept to speech synthesisers; Generation of natural phrases from machine-based concepts
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
Methods, systems and computer program products are provided for determining acoustic feature vectors of query and target items in a first vector space, and mapping the acoustic feature vectors to a second vector space having a lower dimension. The distribution of vectors in the second vector space can then be used to identify items from the same songs, and/or items that are complementary. A mapping function is trained using a machine learning algorithm, such that complementary audio items are closer in the second vector space than the first, according to a given distance metric.
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of groups characterised by the analysis technique using neural networks
A method, which may be performed at an electronic device, such as a media server associated with a media-providing service, causes a set of media items to be provided to a user based on identifying performance listings relevant to the user. The method includes determining a list of one or more performance listings of artists relevant to a user based on a media consumption history of the user, the media consumption history describing media content items previously delivered to the user by a media content server, and a listening profile of a second user, distinct from the first user, the listening profile identifying media content and artists played by the second user via the media content server. The method includes providing one or more media items to the user, the one or more media items selected based on the list of one or more performance listings.
A descriptive media content search solution is provided to allow a user to search for media content that better matches a user's descriptive search request. The descriptive media content search solution utilizes an extensive catalog of playlists each having a playlist description, such as a playlist title or other descriptive text, and identifies additional descriptive information for media content items to be searched. The descriptive media content search solution can set up a descriptive search database and utilize the descriptive search database to conduct a descriptive search responsive to the user's descriptive search request.
G06F 16/48 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/41 - Indexing; Data structures therefor; Storage structures
G06F 16/438 - Presentation of query results
G06F 16/2457 - Query processing with adaptation to user needs
49.
Systems and methods for using hierarchical ordered weighted averaging for providing personalized media content
An electronic device, for each media content item of a plurality of media content items, receives a respective score for each of a first set of objectives and one or more other objectives and generates a respective score between a user and the media content item. The generating includes applying a first ordered weighted average to the respective scores for the first set of objectives, to produce a first combined score for the first set of objectives, and applying a second ordered weighted average to the respective scores for a second set of objectives, wherein the second set of objectives includes (i) a resulting objective corresponding to the first set of objectives and having the first combined score and (ii) the one or more other objectives. The electronic device streams media content to the user selected based on the respective scores between the user and the media content items.
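The two-level scoring above can be sketched in Python. An ordered weighted average (OWA) sorts its inputs and then takes a weighted sum, so each weight attaches to a rank rather than to a particular objective. The sketch assumes descending order and weights that sum to 1, which is the usual OWA convention; the function names and the example weights are hypothetical.

```python
def owa(scores, weights):
    """Ordered weighted average: weights apply to scores sorted high to low."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    ordered = sorted(scores, reverse=True)
    return sum(w * s for w, s in zip(weights, ordered))


def hierarchical_owa(first_set_scores, other_scores, inner_weights, outer_weights):
    """Combine the first set with one OWA, then feed the result into a second OWA.

    The combined score of the first set acts as one more objective
    alongside `other_scores` in the outer average.
    """
    combined = owa(first_set_scores, inner_weights)
    return owa([combined] + list(other_scores), outer_weights)
```

For instance, `hierarchical_owa([0.2, 0.8], [1.0], [0.5, 0.5], [0.25, 0.75])` first averages the inner pair and then weights the lower-ranked of the two outer values more heavily.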
A first client device is associated with a first user hosting a shared playback session. While a first media content item from the shared playback session is being presented on a set of presentation devices, the first client device communicates with a set of observer devices for the shared playback session. The first client device receives a request to modify playback of the shared playback session from a second client device, the second client device being one observer device of the set of observer devices. In response to the request to modify playback of the shared playback session, the first client device determines an action to take with respect to the shared playback session. In response to determining the action to take with respect to the shared playback session, the first client device sends a command for the action to each of the set of presentation devices.
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04L 65/401 - Support for services or applications wherein the services involve a main real-time session and one or more additional parallel real-time or time sensitive sessions, e.g. white board sharing or spawning of a subconference
H04N 21/4788 - Supplemental services, e.g. displaying phone caller identification or shopping application communicating with other users, e.g. chatting
51.
Adaptive multi-model item selection systems and methods
An adaptive multi-model item selection method, comprising: receiving, from one of a plurality of client devices, a request including a client-side feature vector representing a state of the client device; determining, by an advocate model, a probability distribution of a plurality of specialist cluster models from the client-side feature vector; choosing, by a use case selector, a cluster corresponding to a use case from the probability distribution; and obtaining, by the use case selector based on the cluster (i.e., the cluster that was sampled by the use case selector), a specialist cluster model from the plurality of specialist cluster models.
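A minimal Python sketch of the selection flow above follows. The abstract does not say how the advocate model computes its distribution, so a softmax over per-cluster linear scores is used here purely as a stand-in; `cluster_weights`, `specialists`, and both function names are hypothetical.

```python
import math
import random


def advocate_distribution(features, cluster_weights):
    """Stand-in advocate model: softmax over one linear score per cluster."""
    logits = [sum(w * x for w, x in zip(ws, features)) for ws in cluster_weights]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_specialist(features, cluster_weights, specialists, rng=random):
    """Sample a cluster from the distribution and return its specialist model."""
    probs = advocate_distribution(features, cluster_weights)
    index = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return index, specialists[index]
```

Sampling from the distribution, rather than always taking the most probable cluster, matches the abstract's note that the cluster is "sampled" by the use case selector.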
Apparatus, methods and computer-readable media are provided for processing wind noise. Processing begins by receiving an audio input. A wind noise level representative of a wind noise at the microphone array is measured using the audio input and a determination is made, based on the wind noise level, whether to perform either (i) a wind noise suppression process on the audio input on-device, or (ii) the wind noise suppression process on the audio input on-device and an audio reconstruction process in-cloud.
G10L 21/0232 - Processing in the frequency domain
H04R 1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
A method is provided for modifying a first media content item by superimposing a first set of data over a first audio event having an amplitude that satisfies a first threshold. The first audio event has a first audio profile, the first set of data has a second audio profile, playback of the second audio profile is configured to be masked by the first audio profile during playback of the first media content item, and the first set of data includes playlist information. The method includes transmitting, to a second electronic device, the modified first media content item.
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
54.
METHODS AND SYSTEMS FOR PROVIDING PERSONALIZED CONTENT BASED ON SHARED LISTENING SESSIONS
An electronic device receives a request, from a first device of a host user, to initiate a first shared playback session for the first device and one or more additional devices. The electronic device streams media content from a first playback queue to the first device and to the one or more additional devices, the first playback queue including one or more media content items corresponding to the first shared playback session. The electronic device determines that the first device of the host user has left the first shared playback session and, in response, maintains the first playback queue to be accessed by the one or more additional devices. After the host user has left the first shared playback session, the electronic device provides one or more media content items from the first playback queue to at least one of the one or more additional devices.
H04N 21/442 - Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available in the
H04N 21/485 - End-user interface for client configuration
H04N 21/647 - Control signalling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream by dropping packets, protecting content from unauthorised alteration within the network or monitoring the load of the netwo
H04N 21/44 - Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to MPEG-4 scene graphs
H04N 21/439 - Processing of audio elementary streams
55.
METHODS AND SYSTEMS FOR SYNTHESISING SPEECH FROM TEXT
A method for synthesising speech from text includes receiving text and encoding, by way of an encoder module, the received text. The method further includes determining, by way of an attention module, a context vector from the encoding of the received text, wherein determining the context vector comprises at least one of: applying a threshold function to an attention vector and accumulating the thresholded attention vector, or applying an activation function to the attention vector and accumulating the activated attention vector. The method further includes determining speech data from the context vector.
G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
G10L 13/047 - Architecture of speech synthesisers
A method for personalizing media content for a user is provided. The method includes, at an electronic device, streaming a first media item from a first set of media items, the first set of media items compiled using a first recommendation hypothesis. The method further includes, while streaming the first media item, in response to a first user request, selecting, without user intervention, a second set of media items, distinct from the first set of media items, including determining a presentation order of a plurality of sets of media items using a heuristic applied to the plurality of sets of media items. The second set of media items is compiled using a second recommendation hypothesis, wherein the second recommendation hypothesis is distinct from the first recommendation hypothesis. The method includes streaming a second media item from the second set of media items.
H04L 65/613 - Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for the control of the source by the destination
H04L 65/1089 - In-session procedures by removing media
This disclosure is directed to adjusting a playlist of media-content items. One aspect is a method comprising: receiving a request to adjust a playlist comprising initial media-content items; in response to receiving the request, compiling a set of features for the playlist and selecting a media-content item from the initial media-content items as a strong seed; predicting scores for a plurality of candidate media-content items based at least in part on the set of features for the playlist and the strong seed, the scores indicating a likelihood that a corresponding candidate media-content item will be added to the playlist; and inserting a candidate media-content item of the plurality of candidate media-content items after the strong seed media-content item based at least in part on the scores predicted for the plurality of candidate media-content items.
An audio cancellation system includes a voice enabled computing system that is connected to an audio output device using a wired or wireless communication network. The voice enabled computing device can provide media content to a user and receive a voice command from the user. The connection between the voice enabled computing system and the audio output device introduces a time delay between the media content being generated at the voice enabled computing device and the media content being reproduced at the audio output device. The system operates to determine a calibration value adapted for the voice enabled computing system and the audio output device. The system uses the calibration value to filter the user's voice command from a recording of ambient sound including the media content, without requiring significant use of memory and computing resources.
G10L 21/0232 - Processing in the frequency domain
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of the groups specially adapted for particular use for comparison or discrimination
A method for communicating a playback order for a plurality of media content items to a user device operating in an online mode, the method performed at a server system and comprising receiving an indication that the user device will enter an offline mode, generating a playback order for the plurality of media content items, and transmitting the generated playback order to the user device before the user device enters the offline mode.
Methods, systems, and computer programs for generating a playlist of media content items without explicit content. A vector space is created that represents explicit and non-explicit tracks in the same playlists created by other users and then tracks are filtered based on cosine distance between the “seed tracks” and all the tracks in the aforementioned playlist. The explicit tracks are filtered out, and tracks are sorted based on the affinity of the user to the artist.
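The filtering described above can be sketched roughly as follows. This is a minimal illustration, not the claimed implementation: the track dictionaries, the `max_distance` threshold, and the `artist_affinity` mapping are assumptions introduced for the example, and vectors are assumed to be non-zero.

```python
import math

def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def clean_playlist(seed_vectors, tracks, artist_affinity, max_distance=0.3):
    """Return ids of non-explicit tracks close to any seed, sorted by artist affinity.

    tracks: list of dicts with hypothetical keys "id", "vector", "explicit", "artist".
    artist_affinity: hypothetical mapping of artist name to the user's affinity score.
    """
    kept = []
    for track in tracks:
        if track["explicit"]:
            continue  # explicit tracks are filtered out
        nearest = min(cosine_distance(seed, track["vector"]) for seed in seed_vectors)
        if nearest <= max_distance:
            kept.append(track)
    # Highest-affinity artists first; unknown artists get affinity 0.0.
    kept.sort(key=lambda t: artist_affinity.get(t["artist"], 0.0), reverse=True)
    return [t["id"] for t in kept]
```

The threshold of 0.3 is arbitrary; a real system would presumably tune it against the learned vector space.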
G06F 16/635 - Filtering based on additional data, e.g. user or group profiles
H04N 21/4545 - Input to filtering algorithms, e.g. filtering a region of the image
H04N 21/45 - Management operations performed by the client for facilitating the reception of or the interaction with the content, or for administrating data related to the end-user or to the client device itself, e.g. learning preferen
H04N 21/454 - Content filtering, e.g. blocking advertisements
G06F 16/638 - Presentation of query results
G06F 16/683 - Data retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
A system for supporting a user's repetitive motion activity operates to manage cadence-based playlists identifying one or more media content items having a tempo corresponding to a user's cadence. The cadence-based playlists can be categorized by different tempi or tempo ranges that cover all likely cadences during the user's activities. A media-playback device is provided to acquire a user's cadence and retrieve a cadence-based playlist associated with a tempo or a tempo range corresponding to the cadence.
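As a rough illustration of the lookup described above, the sketch below maps a measured cadence to the playlist whose tempo range contains it. The range boundaries and playlist identifiers are hypothetical.

```python
def playlist_for_cadence(cadence_spm, playlists):
    """Return the playlist id whose tempo range contains the cadence, or None.

    cadence_spm: cadence in steps per minute.
    playlists: list of ((low_bpm, high_bpm), playlist_id) pairs; ranges are
    treated as half-open [low, high) so adjacent ranges do not overlap.
    """
    for (low, high), playlist_id in playlists:
        if low <= cadence_spm < high:
            return playlist_id
    return None

# Hypothetical tempo ranges covering typical walking and running cadences.
CADENCE_PLAYLISTS = [
    ((100, 120), "walk-100-120"),
    ((120, 140), "jog-120-140"),
    ((140, 160), "run-140-160"),
    ((160, 190), "sprint-160-190"),
]
```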
G06F 17/30 - Information retrieval; Database structures therefor
G06F 16/683 - Data retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
G06F 16/638 - Presentation of query results
G06F 16/68 - Data retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/9535 - Search customisation based on user profiles and personalisation
G06F 16/9538 - Presentation of query results
G05B 15/02 - Systems controlled by a computer electric
An electronic device provides, to a user, a user-curated playlist, the user-curated playlist including an ordered set of media items that were added by the user. While providing a first media item in the ordered set of media items, the electronic device receives a first user input selecting an option to include recommended media items in the user-curated playlist. In response to the first user input, the electronic device updates the user-curated playlist to include a first recommended media item, the first recommended media item selected without user intervention based at least in part on attributes of the user-curated playlist. The first recommended media item is positioned in the user-curated playlist in between media items that were added to the ordered set of media items by the user.
G06F 16/638 - Presentation of query results
G06F 16/635 - Filtering based on additional data, e.g. user or group profiles
G06F 16/735 - Filtering based on additional data, e.g. user or group profiles
G06F 16/783 - Data retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
64.
SYSTEMS AND METHODS FOR IMPORTING AUDIO FILES IN A DIGITAL AUDIO WORKSTATION
A method includes displaying a user interface of a digital audio workstation, which includes a composition region for generating a composition. The composition region includes a representation of a first MIDI file that has already been added to the composition by a user. The method further includes receiving a user input to import, into the composition region, an audio file. In response to the user input to import the audio file, the method includes importing the audio file, which includes, without user intervention, aligning the audio file with a rhythm of the first MIDI file, modifying a rhythm of the audio file based on the rhythm of the first MIDI file, and displaying a representation of the audio file in the composition region.
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE Details of electrophonic musical instruments
This disclosure concerns the provision of media, and more particularly streaming of media. In particular, one aspect herein relates to a method performed by a server system of streaming an audio content item to an electronic device. In response to receiving a request message from the electronic device, a selected audio content item is retrieved. A second storage is browsed to locate non-static media content item(s) associated with the selected audio content item. In response to finding a non-static media content item associated with the selected audio content item, the selected audio content item is sent along with the located non-static media content item to the electronic device for simultaneous presentation of the audio content item and the located non-static media content item.
G06F 16/683 - Data retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
H04N 21/4722 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content for requesting additional data associated with the content
H04N 21/431 - Generation of visual interfaces; Content or additional data rendering
G06F 16/48 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/583 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
G06F 16/783 - Data retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
A system, method and computer product for training a neural network system. The method comprises inputting an audio signal to the system to generate plural outputs f(X, Θ). The audio signal includes one or more of vocal content and/or musical instrument content, and each output f(X, Θ) corresponds to a respective one of the different content types. The method also comprises comparing individual outputs f(X, Θ) of the neural network system to corresponding target signals. For each compared output f(X, Θ), at least one parameter of the system is adjusted to reduce a result of the comparing performed for the output f(X, Θ), to train the system to estimate the different content types. In one example embodiment, the system comprises a U-Net architecture. After training, the system can estimate various different types of vocal and/or instrument components of an audio signal, depending on which type of component(s) the system is trained to estimate.
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE Details of electrophonic musical instruments
An electronic device generates a respective user queue for each user of a plurality of users participating in a shared listening session. While providing a first media content item for playback, the device receives a second request, from a first user, to add a second media content item to the shared playback queue and updates the respective user queue for the first user. After receiving the second request, the electronic device receives a third request, from a second user, to add a third media content item to the shared playback queue and updates the respective user queue for the second user. The electronic device updates the shared playback queue using the respective user queues of the first user and the second user, including positioning the third media content item in an order of the shared playback queue to be played back before the second media content item.
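One way to picture how per-user queues could yield that ordering is a round-robin merge, sketched below. The abstract does not state the merge rule, so round-robin is an assumption, as is the premise that the first user's queue already holds the item currently playing.

```python
from itertools import zip_longest

def build_shared_queue(user_queues):
    """Merge per-user queues into one shared queue by taking turns.

    user_queues: dict mapping user id to a list of item ids in request order.
    Dict insertion order is used as the turn order (an assumption).
    Item ids are assumed never to be None.
    """
    shared = []
    # Each iteration is one "round": the next pending item from every user.
    for round_items in zip_longest(*user_queues.values()):
        shared.extend(item for item in round_items if item is not None)
    return shared

# The first user requested the item now playing plus a second item;
# the second user then requests a third item.
queues = {"user1": ["first", "second"], "user2": ["third"]}
shared = build_shared_queue(queues)
```

Under these assumptions the third item lands ahead of the second, because the second user has not yet had a turn.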
H04N 21/458 - Scheduling content for creating a personalised stream, e.g. by combining a locally stored advertisement with an incoming stream; Updating operations, e.g. for OS modules
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04N 21/442 - Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available in the
H04N 21/25 - Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication or
H04N 21/258 - Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-user preferences to generate co
H04N 21/45 - Management operations performed by the client for facilitating the reception of or the interaction with the content, or for administrating data related to the end-user or to the client device itself, e.g. learning preferen
H04N 21/466 - Learning process for intelligent management, e.g. learning user preferences for recommending movies
H04N 21/262 - Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying
Systems, devices, apparatuses, components, methods, and techniques are provided for a simple user interface that can facilitate discovery of contextually relevant media content with minimal navigation. For example, the disclosed user interface may present contextually relevant categories, sub-categories and media content items while concurrently playing a media content item predicted to likely be selected by the user.
H04N 21/442 - Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available in the
H04N 21/2668 - Creating a channel for a dedicated end-user group, e.g. by inserting targeted commercials into a video stream based on end-user profiles
H04N 21/45 - Management operations performed by the client for facilitating the reception of or the interaction with the content, or for administrating data related to the end-user or to the client device itself, e.g. learning preferen
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
69.
SYSTEMS AND METHODS FOR SEQUENCING A PLAYLIST OF MEDIA ITEMS
A server system receives a request to generate a playlist. The playlist includes a sequence of media items. The server system receives a plurality of constraints that define disqualification criteria for excluding media items from a respective slot in the sequence of media items. The plurality of constraints for the respective slot in the sequence of media items includes at least one constraint that is based on already-populated slots in the sequence of media items. The server system generates the playlist by sequentially populating each respective slot in the sequence of media items, including selecting, for the respective slot, a respective media item that meets the plurality of constraints for the respective slot in the sequence of media items. The server system provides the playlist to a user of the media providing service.
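The slot-by-slot population described above might look like the following sketch. The greedy "first qualifying candidate" choice, the `(title, artist)` tuples, and both example constraints are assumptions made for illustration.

```python
def sequence_playlist(candidates, num_slots, constraints):
    """Fill slots in order, skipping any candidate a constraint disqualifies.

    constraints: functions (item, slot_index, filled) -> True when the item is
    disqualified for that slot; `filled` is the list of already-populated slots.
    """
    playlist = []
    for slot in range(num_slots):
        for item in candidates:
            if not any(rule(item, slot, playlist) for rule in constraints):
                playlist.append(item)
                break
        else:
            raise ValueError(f"no candidate satisfies the constraints for slot {slot}")
    return playlist

# Two hypothetical constraints, both based on already-populated slots.
def already_used(item, slot, filled):
    return item in filled

def same_artist_as_previous(item, slot, filled):
    return bool(filled) and filled[-1][1] == item[1]

tracks = [("a1", "A"), ("a2", "A"), ("b1", "B"), ("c1", "C")]
ordered = sequence_playlist(tracks, 3, [already_used, same_artist_as_previous])
```

A greedy pass can dead-end where a backtracking search would succeed; the sketch raises an error in that case rather than retrying.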
H04N 21/262 - Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying
H04N 21/454 - Content filtering, e.g. blocking advertisements
H04N 21/239 - Interfacing the upstream path of the transmission network, e.g. prioritising client requests
70.
SYSTEMS AND METHODS FOR DETERMINING DESCRIPTORS FOR MEDIA CONTENT ITEMS
An electronic device obtains a plurality of collections of media content items, each collection of media content items being associated with text generated by one or more users of the media-providing service. Based on how frequently a first media content item co-occurs with a first descriptor in text for respective collections of media items that include the first media content item, the electronic device generates, without user input, a new collection of media content items for a first user. The new collection of media content items corresponds to the first descriptor and includes the first media content item. The electronic device presents the new collection of media content items to the first user as a recommendation.
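The co-occurrence idea above can be illustrated with a simple counting sketch. Whitespace tokenisation, the fixed descriptor vocabulary, and the `min_count` threshold are assumptions; the abstract does not specify how frequency is measured or thresholded.

```python
from collections import Counter, defaultdict

def descriptor_counts(collections, descriptors):
    """Count, per descriptor, how many collections pair it with each item.

    collections: list of (text, item_ids) pairs, where text is user-written
    (for example a playlist title). descriptors: set of lowercase words.
    """
    counts = defaultdict(Counter)
    for text, items in collections:
        words = set(text.lower().split())
        for descriptor in descriptors & words:
            for item in set(items):  # count each item once per collection
                counts[descriptor][item] += 1
    return counts

def new_collection(counts, descriptor, min_count=2):
    """Items that co-occur with the descriptor in at least min_count collections."""
    return sorted(item for item, n in counts[descriptor].items() if n >= min_count)

playlists = [
    ("Chill evening", ["t1", "t2"]),
    ("chill study", ["t1", "t3"]),
    ("Workout", ["t2"]),
]
counts = descriptor_counts(playlists, {"chill", "workout"})
```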
G06F 16/908 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
G06F 16/9535 - Search customisation based on user profiles and personalisation
G06F 16/68 - Data retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
Technology for generating, reading, and using machine-readable codes is disclosed. There is a method, performed by an image capture device, for reading and using the codes. The method includes obtaining an image and identifying an area in the image having a machine-readable code. The method also includes, within the image area, finding a predefined start marker defining a start point and a predefined stop marker defining a stop point, an axis being defined therebetween. A plurality of axis points can be defined along the axis. For each axis point, a first distance within the image area to a mark is determined. The distance can be measured from the axis point in a first direction which is orthogonal to the axis. The first distances can be converted to a binary code using Gray code such that each first distance encodes at least one bit of data in the code.
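The distance-to-bits conversion can be sketched as below. It assumes each measured distance is quantised to one of eight levels and that the bits for a level are its reflected-binary Gray codeword, so a reading that is off by one level corrupts only one bit. The step size and the three bits per mark are illustrative choices, not values given in the abstract.

```python
def binary_to_gray(n):
    """Reflected-binary Gray codeword for the non-negative integer n."""
    return n ^ (n >> 1)

def decode_distances(distances, step, bits_per_mark=3):
    """Convert measured distances to a bit string, one Gray codeword per mark.

    distances: distance from each axis point to its mark, in pixels.
    step: pixel distance between adjacent quantisation levels (assumed known).
    """
    max_level = (1 << bits_per_mark) - 1
    bits = []
    for distance in distances:
        # Quantise to the nearest level and clamp to the valid range.
        level = min(max(round(distance / step), 0), max_level)
        bits.append(format(binary_to_gray(level), f"0{bits_per_mark}b"))
    return "".join(bits)
```

For example, `decode_distances([0.0, 1.1, 2.0, 6.9], step=1.0)` quantises to levels 0, 1, 2 and 7 and returns `"000001011100"`.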
G06K 19/06 - Record carriers for use with machines and with at least a part designed to carry digital markings characterised by the kind of the digital marking, e.g. shape, nature, code
G06K 7/14 - Methods or arrangements for sensing record carriers by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
72.
Display screen with animated graphical user interface
Audio content episodes are received. Using machine learning, one or more audio segments of interest are identified in each of the audio content episodes based at least in part on an analysis of content included in a corresponding audio content episode. Each of the identified audio segments is associated with one or more automatically determined tags. Using machine learning, a recommended audio segment is selected for a specific user from the identified audio segments based at least in part on attributes of the specific user and the automatically determined tags of the identified audio segments. The recommended audio segment is automatically provided in an audio segment feed.
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of the groups specially adapted for particular use for comparison or discrimination
G06F 3/14 - Digital output to display device
G10L 17/00 - Speaker identification or verification
G06N 5/04 - Inference or reasoning models
G06F 16/638 - Presentation of query results
G06F 16/64 - Browsing; Visualisation therefor
A method comprises the following steps: providing a Gaussian process variational autoencoder (GP-VAE) including a Gaussian process (GP) encoder and a neural network decoder; selecting a plurality of inducing points in a data space; generating a mapping of the plurality of inducing points in a latent space; and training the GP-VAE using a training dataset.
A system, method and computer product for combining audio tracks. In one example embodiment herein, the method comprises determining at least one music track that is musically compatible with a base music track, aligning those tracks in time, and combining the tracks. In one example embodiment herein, the tracks may be music tracks of different songs, the base music track can be an instrumental accompaniment track, and the at least one music track can be a vocal track. Also in one example embodiment herein, the determining is based on musical characteristics associated with at least one of the tracks, such as an acoustic feature vector distance between tracks, a likelihood of at least one track including a vocal component, a tempo, or musical key. Also, determining of musical compatibility can include determining at least one of a vertical musical compatibility or a horizontal musical compatibility among tracks.
G10H 1/00 - ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE Details of electrophonic musical instruments
Apparatus, methods and computer-readable medium are provided for processing wind noise. Audio input is processed by receiving an audio input. A wind noise level representative of a wind noise at the microphone array is measured using the audio input and a determination is made, based on the wind noise level, whether to perform either (i) a wind noise suppression process on the audio input on-device, or (ii) the wind noise suppression process on the audio input on-device and an audio reconstruction process in-cloud.
G10L 21/0232 - Processing in the frequency domain
H04R 1/40 - Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
A server receives a request to play a selected playlist from a first electronic device associated with a host listener. The playlist includes audio items having a common attribute. The server also receives an identity of a guest listener having a second electronic device and retrieves an indication of taste of the guest listener based on the received identity. The server selects an additional audio item based at least in part on the indication of taste of the guest listener and the common attribute of the audio items of the selected playlist, and incorporates the additional audio item into the selected playlist.
Systems, devices, apparatuses, components, methods, and techniques for generating and playing a selectable content depth media program are provided. Media content items are edited to produce selectable depth media segments which are assembled into selectable depth media programs. A media-playback device is configured to navigate and play the selectable depth media program through interaction by a listening user. The user selects the desired content depth for each media segment.
H04N 21/482 - End-user interface for program selection
H04N 21/2387 - Stream processing in response to a playback request from an end-user, e.g. for trick play
81.
METHODS AND SYSTEMS FOR PROVISIONING SETTINGS OF A MEDIA PLAYBACK DEVICE
A system is provided for streaming media content in a vehicle. The system includes a personal media streaming appliance system configured to connect to a media delivery system and receive media content from the media delivery system at least via a cellular network. The media delivery system is configured to link a user media streaming account with a particular personal media streaming appliance to provide personalized media content to the appliance. Media contexts are assigned to multiple preset settings automatically so that the personal media streaming appliance system is configured to output personalized media content upon first use.
H04L 41/0806 - Configuration setting for initial configuration or provisioning, e.g. plug-and-play
H04L 41/08 - Configuration management of networks or network elements
G06Q 30/06 - Buying, selling or leasing transactions
H04L 65/612 - Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio for unicast
G06Q 10/08 - Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
H04L 65/1063 - Application servers providing network services
A server determines, at a first predetermined time, a default decision as to whether to provide a first media content clip after the end of a first media content item. At a second predetermined time, after the first predetermined time, the server initiates a determination of a first decision as to whether to provide the first media content clip after the end of the first media content item. In accordance with the first decision being reached within a predetermined latency period, the server provides the first media content clip to the first electronic device in accordance with the first decision. In accordance with a determination that the predetermined latency period has elapsed without the first decision being reached, the server provides the first media content clip to the first electronic device in accordance with the default decision.
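The fallback pattern above resembles a computation guarded by a timeout. The sketch below is one way to express it with a worker thread; the function name and the use of a thread pool are assumptions, and the default decision is taken as already computed.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

def decide_with_fallback(compute_decision, default_decision, latency_s):
    """Return the freshly computed decision if it arrives within latency_s seconds,
    otherwise return the default decision computed earlier.

    compute_decision: zero-argument callable that produces the first decision.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(compute_decision)
    try:
        return future.result(timeout=latency_s)
    except FutureTimeout:
        return default_decision
    finally:
        # Do not block on a slow computation; let the worker finish on its own.
        pool.shutdown(wait=False)
```

A late result is simply discarded here; whether a real server would cache it for later use is not stated in the abstract.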
H04L 65/61 - Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio
84.
SYSTEMS AND METHODS FOR GENERATING TRAILERS FOR AUDIO CONTENT
An electronic device receives an audio file and divides the audio file into a plurality of segments. The electronic device, automatically, without user input, determines, for each segment, a descriptor from a plurality of descriptors and a value of the descriptor for the segment. The electronic device selects one or more segments of the plurality of segments, based on a comparison of the respective values of respective descriptors for respective segments and genre-specific criteria selected based on a genre of the audio file. The electronic device generates a trailer for the audio file using the selected one or more segments.
G10L 25/63 - Speech or voice analysis techniques not restricted to a single one of the groups specially adapted for particular use for comparison or discrimination for estimating an emotional state
G10L 25/30 - Speech or voice analysis techniques not restricted to a single one of the groups characterised by the analysis technique using neural networks
G10L 15/04 - Segmentation; Word boundary detection
G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
85.
System and method for generating models representing users of a media providing service
A method of recommending media items to a user is provided. The method includes receiving historical data for a user of a media providing service. The historical data indicates past interactions of the user with media items. The method includes generating a model of the user. The model includes a first set of parameters, each of the first set of parameters quantifying a predicted latent preference of the user for a respective media item provided by the media providing service. The method includes evaluating the predicted latent preferences of the user for the respective media items against the historical data indicating the past interactions of the user with the media items provided by the media providing service. The method includes selecting a recommender system from a plurality of recommender systems using the model of the user, including the first set of parameters. The method includes providing a media item to a second user using the selected recommender system.
An electronic device associated with a media-providing service receives a first media item and a request, from a second device, for playback of the first media content item. The electronic device determines an insertion time within the first media content item for inserting a second media content item, and generates a queue indicating an order in which a first, second, and third file are to be provided. The first file corresponds to a portion of the first media content item from a start of the first media content item until the insertion time, the second file corresponds to the second media content item, and the third file corresponds to a portion of the first media content item starting at the insertion time. The electronic device generates the files, and queues the second electronic device to play back the first, second, and the third files in accordance with the queue.
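The three-file queue described above can be sketched as a list of segment descriptors. The dictionary layout and parameter names are hypothetical; actual file generation (cutting and encoding audio) is outside the scope of this sketch.

```python
def build_insertion_queue(item_id, item_duration_s, insertion_time_s, inserted_id):
    """Describe the three files to play, in order, around an insertion point.

    Returns a list of dicts with hypothetical keys "source", "start" and "end"
    (seconds); "end" is None when the whole inserted item is played.
    """
    if not 0 < insertion_time_s < item_duration_s:
        raise ValueError("insertion time must fall strictly inside the first item")
    return [
        # First file: start of the first item up to the insertion time.
        {"source": item_id, "start": 0.0, "end": insertion_time_s},
        # Second file: the inserted (second) media content item.
        {"source": inserted_id, "start": 0.0, "end": None},
        # Third file: remainder of the first item from the insertion time.
        {"source": item_id, "start": insertion_time_s, "end": item_duration_s},
    ]
```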
H04N 21/433 - Content storage operation, e.g. storage operation in response to a pause request or caching operations
H04N 21/262 - Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying
H04N 21/658 - Transmission by the client directed to the server
H04N 21/414 - Specialised client platforms, e.g. receiver in car or embedded in a mobile appliance
This disclosure is directed to systems and methods for managing a group session for consuming media content across a plurality of devices. In some configurations and by non-limiting example, the group session operates to synchronize playback and control of media content at the plurality of devices. In one aspect a method of simultaneously playing media content on a plurality of media playback devices for a group session is disclosed.
A server obtains user data for a respective user, including data corresponding to the respective user's consumption of media in a first content domain. Before obtaining, for the respective user, data corresponding to a second content domain, the server uses a neural network to generate a user embedding for the respective user based on the user data. The server generates, for a plurality of content items of the second content domain consumed by users other than the respective user, a respective content item embedding. The respective content item embedding is based on user embeddings of the at least one user other than the respective user. The server system determines, using the user embedding for the respective user and respective content item embeddings, a first content item in the second content domain that meets matching criteria for the respective user and provides the first content item to the respective user.
G06F 16/435 - Filtering based on additional data, e.g. user or group profiles
G06F 16/438 - Presentation of query results
G06F 16/483 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
89.
SYSTEMS AND METHODS FOR COMMUNICATING WITH A DEVICE IN A LOW POWER MODE
A first server system is configured to communicate with a first client device through a first application executing on the first client device. The first server system determines that communication with the first client device through the first application has been lost due to the first client device entering an idle mode. The first server system receives a request from a second client device that triggers reestablishing communication with the first client device through the first application. In response, the first server system transmits a request to a second server system to wake the first client device from the idle mode. The first server system receives, from the first application on the first client device, an indication that communication has been reestablished between the first server system and the first application. The first server system transmits a control command to control the first client device.
A method of processing playback content control commands generated at a client device and communicated by a backend server to the client device and a controlled device to control media content playback at the controlled device is provided. The method includes the following steps: sending, by the client device, a playback content control command to the backend server, wherein the backend server is configured to communicate the playback content control command to the client device and to the controlled device; initiating a buffer time period; and refraining from processing, at the client device, one or more subsequent playback content control commands from the backend server during the buffer time period.
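A client-side sketch of the buffer period follows. The class name, the callback-style `transport` and `apply` parameters, and the injectable clock are assumptions added to keep the example self-contained and testable.

```python
import time

class CommandBuffer:
    """Ignore backend commands that arrive shortly after a locally sent command."""

    def __init__(self, buffer_s, clock=time.monotonic):
        self.buffer_s = buffer_s
        self.clock = clock          # injectable so tests can control time
        self._sent_at = None

    def send(self, command, transport):
        """Send a command to the backend and start the buffer time period."""
        transport(command)
        self._sent_at = self.clock()

    def on_backend_command(self, command, apply):
        """Apply a command from the backend unless the buffer period is active.

        Returns True if the command was processed, False if it was skipped.
        """
        if self._sent_at is not None and self.clock() - self._sent_at < self.buffer_s:
            return False
        apply(command)
        return True
```

Skipping echoed commands during the buffer period keeps the client's own, more recent state from being overwritten by a stale update.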
A system for device-to-device media capturing is described herein. An example system includes a media delivery system of a media service and at least a first and second device associated with respective first and second user accounts of the media service. The first device is also associated with an active media content item provided by the media delivery system that is automatically captured by the second device. For example, as the second device is moved proximate to the first device, one or more wireless communications are transmitted between the devices that trigger device-to-device media capturing. As a result, an identifier for the active media content item is stored to a library of the second user account of the media service. The identifier is stored in response to detecting the proximity of the devices and determining that the second device is moving towards the first device.
H04W 4/80 - Services using short range communication, e.g. near-field communication [NFC], radio-frequency identification [RFID] or low energy communication
Systems, devices, apparatuses, components, methods, and techniques for saving media content to a context for later playback are provided. An example media-playback device for identifying and playing media content for a user traveling in a vehicle includes a context detecting device, a context-driven playback engine, and a media playback engine. Contexts are established by parameters that can be detected by a media-playback device. Contexts are situations that are defined by one or more locations, times, events, activities, people, and devices. Media content is saved to the contexts for later playback. The contexts are detected by the context detecting device, the associated media content is identified by the context-driven playback engine, and the media content is automatically played through the media playback engine, without additional input required by the user.
H04N 21/435 - Processing of additional data, e.g. decrypting of additional data or reconstructing software from modules extracted from the transport stream
H04N 21/462 - Content or additional data management, e.g. creating a master electronic program guide from data received from the Internet and a head-end or controlling the complexity of a video stream by scaling the resolution o
H04N 21/466 - Learning process for intelligent management, e.g. learning user preferences for recommending movies
H04N 21/475 - End-user interface for inputting end-user data, e.g. personal identification number [PIN] or preference data
G11B 27/022 - Electronic editing of analogue information signals, e.g. audio or video signals
H04N 21/458 - Scheduling content for creating a personalised stream, e.g. by combining a locally stored advertisement with an incoming stream; Updating operations, e.g. for OS modules
H04N 21/442 - Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available in the
A source device associated with an account uses playback of a media content item to cause a target device to become associated with the account. The target device enters an association mode and records a portion of the playing content. The target device provides the recording to a server that identifies the song (e.g., using a music fingerprint service) and uses the identification of the song to find the account that caused playback of the identified song. With the account identified, the server provides credentials of the account to the target device. The target device accesses content or services using the account. As confirmation of receiving the credentials, the server causes playback of the content to transition from the source device to the target device.
Methods, systems, and computer programs are provided for generating a playlist of media content items that are popular with the friends of a first user. A first user taste profile is determined and a user taste profile is determined for each of a plurality of social connections. A similarity score is calculated between the first user taste profile and the user taste profile of each social connection. Media content items consumed by social connections with the highest similarity score are selected and placed in a playlist for the first user.
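As a rough illustration of the scoring step, the following Python sketch ranks social connections by similarity to the first user and collects what the closest ones consumed. It assumes taste profiles are dictionaries mapping a feature (such as a genre) to a weight and uses cosine similarity as the score; the abstract names neither the profile representation nor the similarity measure, and `build_friend_playlist`, `top_n`, and `max_items` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse profiles (dict: feature -> weight)."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_friend_playlist(user_profile, friends, top_n=2, max_items=10):
    """Pick items consumed by the top_n most similar social connections.

    Each friend is a dict with a "profile" and a "consumed" list (assumed shape).
    """
    ranked = sorted(
        friends,
        key=lambda f: cosine_similarity(user_profile, f["profile"]),
        reverse=True,
    )
    playlist = []
    for friend in ranked[:top_n]:
        for item in friend["consumed"]:
            if item not in playlist:  # avoid duplicates, keep first-seen order
                playlist.append(item)
    return playlist[:max_items]
```

Other similarity measures would fit the same structure; only the `key` function passed to `sorted` would change.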
H04L 65/611 - Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio, for multicast or broadcast
H04L 65/612 - Network streaming of media packets for supporting one-way streaming services, e.g. Internet radio, for unicast
A system for device discovery for social playback is disclosed. The system operates to connect a host media playback device to a media output device and broadcast a social playback session to guest media playback devices. Upon joining a social playback session, a guest media playback device may control the media playback at the host media playback device, where the media output for the social playback session is provided by the media output device.
H04L 65/60 - Network streaming of media packets
H04W 4/80 - Services using short range communication, e.g. near-field communication [NFC], radio-frequency identification [RFID] or low energy communication
96.
TEXT COMMAND BASED GROUP LISTENING SESSION PLAYBACK CONTROL
A method of providing a group listening session to users of a messaging platform includes: receiving, at a media streaming platform, a message feed from the messaging platform; parsing, at the media streaming platform, the message feed to identify one or more commands included in the message feed, wherein the one or more commands are associated with the group listening session provided by the media streaming platform; and controlling the group listening session according to the one or more commands, each of the users being associated with a media device for participating in the group listening session.
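The parse-then-control flow in this abstract might look like the Python sketch below. The `!`-prefixed command syntax, the four command names, and the dictionary shapes used for messages and for the session are all assumptions made for illustration; the abstract does not define a command format.

```python
import re

# Hypothetical command syntax: a message starting with "!" and a known verb.
COMMAND_PATTERN = re.compile(r"^!(play|pause|skip|queue)\b\s*(.*)$", re.IGNORECASE)


def parse_feed(messages):
    """Return (user, command, argument) tuples for messages that are commands."""
    commands = []
    for message in messages:
        match = COMMAND_PATTERN.match(message["text"].strip())
        if match:
            commands.append(
                (message["user"], match.group(1).lower(), match.group(2).strip())
            )
    return commands


def control_session(session, commands):
    """Apply parsed commands, in order, to a simple session state dict."""
    for _user, name, arg in commands:
        if name == "play":
            session["state"] = "playing"
        elif name == "pause":
            session["state"] = "paused"
        elif name == "skip":
            if session["queue"]:
                session["queue"].pop(0)   # drop the item at the head of the queue
        elif name == "queue" and arg:
            session["queue"].append(arg)  # add the named item to the end
    return session
```

Ordinary chat messages pass through `parse_feed` untouched, so the same feed can carry both conversation and playback control.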
G06F 16/638 - Presentation of query results
G06F 16/683 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
A first electronic device, while a first media content item of a first type is playing back on a second electronic device, detects a user input selecting a second media content item. In response to the user input, the electronic device determines a type of media content item of the selected second media content item. In accordance with a determination that the second media content item is of the first type of media content item, the first electronic device generates a media control request for controlling playback of the second media content item at the second electronic device, and in accordance with a determination that the second media content item is of a second type of media content item, the first electronic device generates a media control request for controlling playback of the second media content item at a third electronic device that is distinct from the second electronic device.
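The type-based routing decision can be sketched in a few lines of Python. The function name, the dictionary fields, and the example type labels are hypothetical; the abstract only states that the target device depends on whether the selected item's type matches the type currently playing.

```python
def build_media_control_request(selected_item, playing_type, second_device, third_device):
    """Build a control request targeted by media type (illustrative sketch).

    If the selected item has the same type as the item playing on the second
    device, the request targets the second device; otherwise it targets the
    third device.
    """
    if selected_item["type"] == playing_type:
        target = second_device
    else:
        target = third_device
    return {
        "target_device": target,
        "action": "play",
        "item_id": selected_item["id"],
    }
```

For example, with audio playing on a speaker, selecting another audio item would target the speaker, while selecting a video item would target a different device such as a television.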
H04L 65/60 - Network streaming of media packets
H04N 21/433 - Content storage operation, e.g. storage operation in response to a pause request or caching operations
H04N 21/443 - OS processes, e.g. booting an STB, implementing a Java virtual machine in an STB or power management in an STB
H04N 21/472 - End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification or for manipulating displayed content
H04L 69/329 - Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]
98.
SYSTEMS AND METHODS FOR JOINTLY ESTIMATING SOUND SOURCES AND FREQUENCIES FROM AUDIO
An electronic device receives a first audio content item that includes a plurality of sound sources. The electronic device generates a representation of the first audio content item. The electronic device determines, from the representation of the first audio content item: a representation of an isolated sound source, and frequency data associated with the isolated sound source. Determining the representation of the isolated sound source and the frequency data associated with the isolated sound source includes using a neural network to jointly determine the representation of the isolated sound source and the frequency data associated with the isolated sound source. The electronic device determines that a portion of a second audio content item matches the first audio content item using the representation of the isolated sound source and/or the frequency data associated with the isolated sound source.
G10L 25/51 - Speech or voice analysis techniques not restricted to a single one of groups specially adapted for particular use for comparison or discrimination
A system and method for controlling access to an on-device machine learning model without the use of encryption is described herein. For example, a request is received from an application executing on a device of a user. The request is to download a machine learning model to the device that enables a feature of the application, and the request includes information associated with the user and/or the device. The information is used to create an obfuscation key, and a derivative model can be generated using a reference copy of the machine learning model and the obfuscation key. The derivative model and the obfuscation key are then sent to the application. When the obfuscation key is provided to the derivative model at runtime, values derived from the obfuscation key are provided as additional inputs that enable the derivative model to function properly.
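One way to picture the idea of a derivative model that only works when key-derived values are supplied as extra inputs is the toy Python sketch below. It is not the scheme of the abstract, which does not disclose how the derivative model is built: here the "model" is a plain linear function, the key is a SHA-256 hex digest of assumed user and device identifiers, and the bias is shifted so that the extra inputs cancel the shift only for the matching key. All function names are hypothetical.

```python
import hashlib


def make_obfuscation_key(user_id, device_id):
    """Derive a hex key from user/device information (assumed derivation)."""
    return hashlib.sha256(f"{user_id}:{device_id}".encode()).hexdigest()


def key_values(key, n=4):
    """Turn the first n bytes of the hex key into values in [0, 1]."""
    digest = bytes.fromhex(key)
    return [digest[i] / 255.0 for i in range(n)]


def make_derivative(reference, key):
    """Create a derivative model whose bias is offset by key-derived values."""
    values = key_values(key)
    extra_weights = [1.0 + i for i in range(len(values))]
    offset = sum(w * v for w, v in zip(extra_weights, values))
    return {
        "weights": list(reference["weights"]),
        "extra_weights": extra_weights,
        "bias": reference["bias"] - offset,  # wrong without the matching key
    }


def predict_reference(model, x):
    return sum(w * v for w, v in zip(model["weights"], x)) + model["bias"]


def predict_derivative(model, x, key):
    """Run the derivative model with key-derived values as additional inputs."""
    values = key_values(key)
    main = sum(w * v for w, v in zip(model["weights"], x))
    extra = sum(w * v for w, v in zip(model["extra_weights"], values))
    return main + extra + model["bias"]
```

With the matching key the extra inputs restore the reference output; with any other key the output is shifted, which is the property the abstract describes at a high level.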
A method of integrating a playback device for use with a backend server of a media streaming platform includes the following steps: providing an application programming interface (API) command processor at a server to send and receive network communication with a cloud playback adapted system; receiving at the API command processor, from a cloud playback client associated with the cloud playback adapted system, a status of the cloud playback adapted system; receiving at the API command processor, from the cloud playback client, a playback command to control playback of a media content item; and sending a message from the API command processor to the cloud playback client in response to the playback command, the message including an identification of the media content item to permit the cloud playback adapted system to retrieve the media content item for playback of the media content item by the cloud playback adapted system.