Systems and methods for data processing are described. Example embodiments include identifying chart data corresponding to a visual element of a user interface; selecting an insight type based on a chart category of the chart data; generating insight data for the insight type based on the chart data using a statistical measure corresponding to the insight type; generating an insight caption for the insight type by combining the insight data with a sentence template corresponding to the insight type; and communicating the insight caption to a user of the user interface.
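As an illustration only (not the patented implementation), the chart-category dispatch and template-filling steps described in the abstract above might be sketched in Python as follows; the insight types, statistical measures, and sentence templates are all assumed names:

    # Minimal sketch of template-based insight captioning; every mapping
    # below is an illustrative assumption, not the actual system.
    INSIGHT_TYPE_BY_CATEGORY = {"bar": "maximum", "line": "trend"}

    MEASURES = {
        "maximum": lambda values: max(values),            # statistical measure
        "trend": lambda values: values[-1] - values[0],
    }
    TEMPLATES = {
        "maximum": "The highest value in '{title}' is {result:.1f}.",
        "trend": "'{title}' changed by {result:+.1f} over the period shown.",
    }

    def caption(chart):
        insight_type = INSIGHT_TYPE_BY_CATEGORY[chart["category"]]  # select type
        result = MEASURES[insight_type](chart["values"])            # insight data
        return TEMPLATES[insight_type].format(title=chart["title"], result=result)

    print(caption({"category": "bar", "title": "Sales", "values": [3.0, 7.5, 5.2]}))
    # -> The highest value in 'Sales' is 7.5.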
A method and system for outage forecasting are described. One or more aspects of the method and system include receiving, by a machine learning model, time series data for a service metric of a computer network; generating, by the machine learning model, probability distribution information for the service metric based on the time series data, wherein the machine learning model is trained using a distribution loss and a classification loss; and generating, by a forecasting component, outage forecasting information for the computer network based on the probability distribution information.
G06Q 10/04 - Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
G06N 7/00 - Computing arrangements based on specific mathematical models
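A hedged sketch of how the two training losses named in the abstract above could be combined, assuming (the abstract does not say) a Gaussian output head over the service metric and a binary outage label:

    import torch
    import torch.nn.functional as F

    def combined_loss(pred_mean, pred_logvar, outage_logit, target, outage_label,
                      weight=1.0):
        # Distribution loss: negative log-likelihood of the observed metric
        # under the predicted Gaussian (an assumed choice of distribution).
        var = pred_logvar.exp()
        distribution_loss = 0.5 * (pred_logvar + (target - pred_mean) ** 2 / var).mean()
        # Classification loss: binary cross-entropy against a float outage label.
        classification_loss = F.binary_cross_entropy_with_logits(outage_logit,
                                                                 outage_label)
        return distribution_loss + weight * classification_loss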
Systems and methods for image dense field based view calibration are provided. In one embodiment, an input image is applied to a dense field machine learning model that generates a vertical vector dense field (VVF) and a latitude dense field (LDF) from the input image. The VVF comprises a vertical vector of a projected vanishing point direction for each of the pixels of the input image. The latitude dense field (LDF) comprises a projected latitude value for the pixels of the input image. A dense field map for the input image comprising the VVF and the LDF can be directly or indirectly used for a variety of image processing manipulations. The VVF and LDF can be optionally used to derive traditional camera calibration parameters from uncontrolled images that have undergone undocumented or unknown manipulations.
The present disclosure relates to systems, non-transitory computer-readable media, and methods for utilizing a design language model and a generative language model to generate digital design documents with design variations. In particular embodiments, the disclosed systems implement the design language model to tokenize the design of a document into a sequence of language tokens. For example, the disclosed systems tokenize visual elements and a layout of the document—in addition to optional user-added content. The generative language model utilizes the sequence of language tokens to predict a next language token representing a suggested design variation. Based on the predicted language token, the disclosed systems generate a modified digital design document visually portraying the suggested design variation. Further, in one or more embodiments, the disclosed systems perform iterative refinements to the modified digital design document.
G06F 3/04845 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range for image manipulation, e.g. dragging, rotation, expansion or change of colour
G06F 40/106 - Display of layout of documents; Previewing
G06F 40/284 - Lexical analysis, e.g. tokenisation or collocates
Certain aspects and features of this disclosure relate to chromatic undertone detection. For example, a method involves receiving an image file and producing, using a color warmth classifier, an image warmth profile from the image file. The method further involves applying a surface-image-trained machine-learning model to the image warmth profile to produce an inferred undertone value for the image file. The method further involves comparing, using a recommendation module and the inferred undertone value, an image color value to a plurality of pre-existing color values corresponding to a database of production images, and causing, in response to the comparing, interactive content including at least one production image selected from the database of production images to be provided on a recipient device.
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/75 - Image or video pattern matching; Proximity measures in feature spaces using context analysis; Selection of dictionaries
G06V 10/56 - Extraction of image or video features relating to colour
G06V 10/60 - Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
G06V 10/774 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or reduction, e.g. principal component analysis [PCA], independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation; Bootstrap methods, e.g. "bagging" or "boosting"
Embodiments provide systems, methods, and computer storage media for management, assessment, navigation, and/or discovery of data based on data quality, consumption, and/or utility metrics. Data may be assessed using attribute-level and/or record-level metrics that quantify data: “quality” - the condition of data (e.g., presence of incorrect or incomplete values), its “consumption” - the tracked usage of data in downstream applications (e.g., utilization of attributes in dashboard widgets or customer segmentation rules), and/or its “utility” - a quantifiable impact resulting from the consumption of data (e.g., revenue or number of visits resulting from marketing campaigns that use particular datasets, storage costs of data). This data assessment may be performed at different stages of a data intake, preparation, and/or modeling lifecycle. For example, current and historical data metrics may be periodically aggregated, persisted, and/or monitored to facilitate discovery and removal of less effective data from a data lake.
G06F 16/2457 - Query processing with adaptation to user needs
G06F 16/25 - Integrating or interfacing systems involving database management systems
G06F 16/215 - Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
The present disclosure relates to systems, non-transitory computer-readable media, and methods for training and/or implementing machine learning models utilizing compressed log scene measurement maps. For example, the disclosed system generates compressed log scene measurement maps by applying a logarithmic function to scene measurement maps. In particular, the disclosed system uses scene measurement distribution metrics from a digital image to determine a base for the logarithmic function. In this way, the compressed log scene measurement maps normalize ranges within a digital image and accurately differentiate between scene objects at a variety of depths. Moreover, for training, the disclosed system generates a predicted scene measurement map via a machine learning model and compares the predicted scene measurement map with a compressed log ground truth map. By doing so, the disclosed system trains the machine learning model to generate accurate compressed log depth maps.
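A minimal sketch of the log compression described above, with an assumed distribution metric (the spread between the map's extremes and its median) choosing the logarithm base; the patent's actual metric is not specified here:

    import numpy as np

    def compress_log_map(depth, eps=1e-6):
        d = np.maximum(depth, eps)
        # Assumed heuristic: wider-range scenes get a larger base, hence
        # stronger compression of distant values.
        base = max(np.max(d) / np.median(d), 2.0)
        log_map = np.log(d) / np.log(base)      # logarithm with data-driven base
        # Normalize to [0, 1] so ranges are comparable across images.
        return (log_map - log_map.min()) / (log_map.max() - log_map.min() + eps)

    print(compress_log_map(np.array([[0.5, 1.0], [10.0, 80.0]])))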
Certain embodiments involve a graphics manipulation application using brushstroke parameters that include a maximum alpha-deposition parameter and a fractional alpha-deposition parameter. For instance, the graphics manipulation application uses an alpha flow increment computed from the maximum alpha-deposition parameter and the fractional alpha-deposition parameter to compute an output canvas color. In some embodiments, if the current canvas opacity exceeds or equals the maximum alpha-deposition parameter, the current canvas opacity is selected as the output canvas opacity. Otherwise, the graphics manipulation application computes the output canvas opacity by increasing the current canvas opacity based on the alpha flow increment. The graphics manipulation application updates a canvas portion affected by a brushstroke input to include the output canvas opacity and the output canvas color.
G06F 3/04883 - Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser, using a touch-screen or digitiser, e.g. input of commands through traced gestures, for inputting data by handwriting, e.g. gestures or text
G06F 3/0482 - Interaction with lists of selectable items, e.g. menus
G06T 11/20 - Drawing from basic elements, e.g. lines or circles
G06T 11/40 - Filling a planar surface by adding surface attributes, e.g. colour or texture
G06T 11/60 - Editing figures and text; Combining figures or text
G06F 3/0354 - Pointing devices displaced or positioned by the user; Accessories therefor with detection of relative movements in two dimensions [2D] between the pointing device, or an operating part thereof, and a plane or surface, e.g. 2D mice, trackballs, pens or pucks
G06T 11/80 - Creating or modifying a manually drawn or painted image using a manual input device, e.g. a mouse, a light pen, or direction keys on a keyboard
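A sketch of the opacity update rule described in the graphics-manipulation abstract above; the exact form of the alpha flow increment is an assumption (a fraction of the remaining headroom below the cap):

    def stamp_opacity(current_alpha, max_deposit, fractional_deposit):
        if current_alpha >= max_deposit:
            return current_alpha                # opacity already at the cap
        # Assumed increment: a fraction of the gap to the cap, so repeated
        # stamps approach but never exceed the maximum alpha-deposition value.
        increment = fractional_deposit * (max_deposit - current_alpha)
        return min(current_alpha + increment, max_deposit)

    alpha = 0.0
    for _ in range(4):                          # four overlapping brush stamps
        alpha = stamp_opacity(alpha, max_deposit=0.8, fractional_deposit=0.5)
        print(round(alpha, 3))                  # 0.4, 0.6, 0.7, 0.75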
9.
SYSTEMS AND METHODS FOR COLOR PALETTE OPTIMIZATION
A method and system for color optimization in generated images are described. The method and system include receiving an image generation prompt that includes a text description of target image content and color information describing a target color palette; encoding the image generation prompt to obtain image features that represent the target image content and the target color palette; and generating an image representing the target image content with the target color palette based on the image features.
G06T 11/80 - Creating or modifying a manually drawn or painted image using a manual input device, e.g. a mouse, a light pen, or direction keys on a keyboard
G06F 16/583 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
The present disclosure relates to systems, non-transitory computer-readable media, and methods that implement a dual-branched neural network architecture to harmonize composite images. For example, in one or more implementations, the transformer-based harmonization system uses a convolutional branch and a transformer branch to generate a harmonized composite image based on an input composite image and a corresponding segmentation mask. More particularly, the convolutional branch comprises a series of convolutional neural network layers followed by a style normalization layer to extract localized information from the input composite image. Further, the transformer branch comprises a series of transformer neural network layers to extract global information based on different resolutions of the input composite image. Utilizing a decoder, the transformer-based harmonization system combines the local information and the global information from the corresponding convolutional branch and transformer branch to generate a harmonized composite image.
Machine-learning model retargeting techniques are described. In one example, training data is generated by extrapolating feedback data collected from entities. These techniques support the ability to identify a wider range of thresholds and corresponding entities than those available in the feedback data. They also provide an opportunity to explore thresholds beyond those used in the past by extrapolating outside of the range used to define the segment for which the feedback data was captured. These techniques also support retargeting of a machine-learning model for a secondary label that is different than a primary label used to initially train the machine-learning model.
The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize a graph neural network to generate data recommendations. The disclosed systems generate a digital graph representation comprising user nodes corresponding to users, data attribute nodes corresponding to data attributes, and edges reflecting historical interactions between the users and the data attributes. Moreover, the disclosed systems generate, utilizing a graph neural network, user embeddings for the user nodes and data attribute embeddings for the data attribute nodes from the digital graph representation. Furthermore, the disclosed systems determine a data recommendation for a target user utilizing the data attribute embeddings and a target user embedding corresponding to the target user from the user embeddings.
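As a hedged sketch of the final step above, once the graph neural network has produced embeddings (assumed precomputed here), a data attribute can be recommended to a target user by embedding similarity:

    import numpy as np

    user_embeddings = {"u1": np.array([0.9, 0.1]), "u2": np.array([0.2, 0.8])}
    attribute_embeddings = {
        "page_views": np.array([1.0, 0.0]),
        "cart_total": np.array([0.1, 1.0]),
    }

    def recommend(target_user, k=1):
        u = user_embeddings[target_user]
        ranked = sorted(attribute_embeddings.items(),
                        key=lambda kv: -float(u @ kv[1]))  # dot-product affinity
        return [name for name, _ in ranked[:k]]

    print(recommend("u1"))  # -> ['page_views']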
The present disclosure relates to systems, methods, and non-transitory computer readable media for remotely generating modified digital images utilizing an interactive image editing architecture. For example, the disclosed systems receive an image editing request for remotely editing a digital image utilizing an interactive image editing architecture. In some cases, the disclosed systems maintain, via a canvas worker container, a digital stream that reflects versions of the digital image. The disclosed systems determine, from the digital stream utilizing the canvas worker container, an image differential metric indicating a difference between a first version of the digital image and a second version of the digital image associated with the image editing request. Further, the disclosed systems provide the image differential metric to a client device for rendering the second version of the digital image to reflect a modification corresponding to the image editing request.
The present disclosure relates to systems, methods, and non-transitory computer-readable media to enhance texture image delivery and processing at a client device. For example, the disclosed systems can utilize a server-side compression combination that includes, in sequential order, a first compression pass, a decompression pass, and a second compression pass. By applying this compression combination to a texture image at the server-side, the disclosed systems can leverage both GPU-friendly and network-friendly image formats. For example, at a client device, the disclosed system can instruct the client device to execute a combination of decompression-compression passes on a GPU-network-friendly image delivered over a network connection to the client device. In so doing, the client device can generate a tri-pass-compressed-texture from a decompressed image comprising texels with color palettes based on previously reduced color palettes from the first compression pass at the server-side, which reduces computational overhead and increases performance speed.
H04N 19/192 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the adaptation method, adaptation tool or adaptation type used for the adaptive coding, the adaptation method, adaptation tool or adaptation type being iterative or recursive
H04N 19/186 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being a colour or a chrominance component
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
15.
Automated Digital Document Generation from Digital Videos
Techniques are described that support automated generation of a digital document from digital videos using machine learning. The digital document includes textual components that describe a sequence of entity and action descriptions from the digital video. These techniques are usable to generate a single digital document based on a plurality of digital videos as well as incorporate user-specified constraints in the generation of the digital document.
G06F 40/166 - Processing of text; Editing, e.g. inserting or deleting
G06V 10/86 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using graph matching
In implementations of systems for image inversion using multiple latent spaces, a computing device implements an inversion system to generate a segment map that segments an input digital image into a first image region and a second image region and assigns the first image region to a first latent space and the second image region to a second latent space that corresponds to a layer of a convolutional neural network. An inverted latent representation of the input digital image is computed using a binary mask for the second image region. The inversion system modifies the inverted latent representation of the input digital image using an edit direction vector that corresponds to a visual feature. An output digital image is generated that depicts a reconstruction of the input digital image having the visual feature based on the modified inverted latent representation of the input digital image.
Embodiments provide systems, methods, and computer storage media for management, assessment, navigation, and/or discovery of data based on data quality, consumption, and/or utility metrics. Data may be assessed using attribute-level and/or record-level metrics that quantify data: “quality”—the condition of data (e.g., presence of incorrect or incomplete values), its “consumption”—the tracked usage of data in downstream applications (e.g., utilization of attributes in dashboard widgets or customer segmentation rules), and/or its “utility”—a quantifiable impact resulting from the consumption of data (e.g., revenue or number of visits resulting from marketing campaigns that use particular datasets, storage costs of data). This data assessment may be performed at different stages of a data intake, preparation, and/or modeling lifecycle. For example, an interactive tree view may visually represent a nested attribute schema and attribute quality or consumption metrics to facilitate discovery of bad data before ingesting into a data lake.
G06Q 10/06 - Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
18.
DATA SELECTION BASED ON CONSUMPTION AND QUALITY METRICS FOR ATTRIBUTES AND RECORDS OF A DATASET
Embodiments provide systems, methods, and computer storage media for management, assessment, navigation, and/or discovery of data based on data quality, consumption, and/or utility metrics. Data may be assessed using attribute-level and/or record-level metrics that quantify data: “quality”—the condition of data (e.g., presence of incorrect or incomplete values), its “consumption”—the tracked usage of data in downstream applications (e.g., utilization of attributes in dashboard widgets or customer segmentation rules), and/or its “utility”—a quantifiable impact resulting from the consumption of data (e.g., revenue or number of visits resulting from marketing campaigns that use particular datasets, storage costs of data). This data assessment may be performed at different stages of a data intake, preparation, and/or modeling lifecycle. For example, a data selection interface may filter based on consumption and/or quality metrics to facilitate discovery of more effective data for machine learning model training, data visualization, or marketing campaigns.
In implementations of systems for tracking receptacles in physical environments, a computing device implements a tracking system to receive radio wave data describing first radio waves received by a first radio frequency antenna from a radio frequency tag embedded in a physical receptacle within a first region of a physical environment. The first radio waves indicate a unique identifier of the radio frequency tag. The tracking system computes an amount of time that the physical receptacle is within the first region based on the unique identifier of the radio frequency tag. An item is identified that does not include a radio frequency tag based on the amount of time and a unique identifier of the first radio frequency antenna. The tracking system generates an indication of information related to the item for display in a user interface of a display device disposed in a second region of the physical environment.
Vector object path segment editing techniques are described that retain editability of a path while supporting editing of a segment included within the path, individually and separately, without editing other segments of the path. A vector object editing module first retrieves information on segments included in a path of a vector object. The vector object editing module then renders a selected segment separately from an adjacent segment based on a model of the selected segment. An editing operation is then applied to the selected segment as specified via the user interface, e.g., to change color, width, or another display characteristic. The vector object editing module then generates a joint between the edited segment and the adjacent segment to provide a transition between the segments that mimics inclusion as a single path that contains those segments.
The present disclosure relates to a digital asset synchronization system that provides improved local and remote synchronization of digital assets. In particular, the digital asset synchronization system manages digital assets by separating each digital asset into multiple components stored as a set of distributed individual files. Employing individual components for a digital asset rather than a single monolithic file enables the digital asset synchronization system to provide safe concurrent access to the digital asset from multiple applications on the same device and across different devices. In addition, using components for a digital asset provides the digital asset synchronization system with the ability to efficiently store and synchronize multiple versions of the digital asset, both locally and remotely.
G06F 16/27 - Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
G06F 16/22 - Indexing; Data structures therefor; Storage structures
G06F 16/901 - Indexing; Data structures therefor; Storage structures
22.
GENERATING AN IMAGE MASK FOR A DIGITAL IMAGE BY UTILIZING A MULTI-BRANCH MASKING PIPELINE WITH NEURAL NETWORKS
Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing a plurality of neural networks in a multi-branch pipeline to generate image masks for digital images. Specifically, the disclosed system can classify a digital image as a portrait or a non-portrait image. Based on classifying a portrait image, the disclosed system can utilize separate neural networks to generate a first mask portion for a portion of the digital image including a defined boundary region and a second mask portion for a portion of the digital image including a blended boundary region. The disclosed system can generate the mask portion for the blended boundary region by utilizing a trimap generation neural network to automatically generate a trimap segmentation including the blended boundary region. The disclosed system can then merge the first mask portion and the second mask portion to generate an image mask for the digital image.
A system and method for content distribution without tracking is described. The system and method includes determining that device identifiers are not available for a first digital content channel; identifying a first cluster of users and a second cluster of users based on the determination that device identifiers are not available; providing first content and second content via the first digital content channel; monitoring user interactions on the first digital content channel to obtain a first conversion rate for users in the first cluster that receive the first content and a second conversion rate for users in the second cluster that receive the second content; computing a cross-cluster treatment effect based on the first conversion rate and the second conversion rate; computing a treatment effect for the first content based on the cross-cluster treatment effect; and providing the first content to a subsequent user based on the treatment effect.
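A worked sketch of the conversion-rate arithmetic above; the specific estimator (a simple difference in rates) is an assumption for illustration:

    def conversion_rate(conversions, impressions):
        return conversions / impressions

    r1 = conversion_rate(120, 1000)   # first cluster, shown first content
    r2 = conversion_rate(90, 1000)    # second cluster, shown second content

    # Assumed cross-cluster treatment effect: the difference between the two
    # cluster/content conversion rates.
    cross_cluster_effect = r1 - r2
    print(f"cross-cluster treatment effect: {cross_cluster_effect:+.3f}")  # +0.030

    # Provide the first content to subsequent users only if its effect is positive.
    serve_first_content = cross_cluster_effect > 0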
The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize intelligent contextual bias weights for informing keyphrase relevance models to extract keyphrases. For example, the disclosed systems generate a graph from a digital document by mapping words from the digital document to nodes of the graph. In addition, the disclosed systems determine named entity bias weights for the nodes of the graph utilizing frequencies with which the words corresponding to the nodes appear within named entities identified from the digital document. Moreover, the disclosed systems generate a keyphrase summary for the digital document utilizing the graph and a machine learning model biased according to the named entity bias weights for the nodes of the graph.
G06V 30/416 - Extraction of the logical structure, e.g. chapters, sections or page numbers; Identification of elements of the document, e.g. authors
G06V 30/19 - Recognition using electronic means
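A small sketch of entity-biased keyphrase scoring in the spirit of the abstract above, using personalized PageRank over a word co-occurrence graph as a stand-in for the patent's keyphrase relevance model:

    import networkx as nx

    G = nx.Graph()
    G.add_edges_from([("adobe", "acquires"), ("acquires", "figma"),
                      ("figma", "design"), ("design", "tool")])  # toy word graph

    # Named entity bias weights: how often each word appears inside named
    # entities detected in the document (assumed counts).
    entity_counts = {"adobe": 3, "figma": 2}
    bias = {w: 1 + entity_counts.get(w, 0) for w in G}  # keep every node nonzero

    scores = nx.pagerank(G, personalization=bias)
    print(sorted(scores, key=scores.get, reverse=True)[:2])  # entity-biased top words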
Certain aspects and features of this disclosure relate to partitioning machine learning models. For example, a method includes accessing a machine learning model configured for processing a data object and partitioning the machine learning model into a number of partitions. Each of the partitions of the machine learning model is characterized with respect to runtime requirements. Each of the partitions of the machine learning model is executed using a runtime environment corresponding to runtime requirements of the respective partition to process the data object. Output can be rendered based on the processing of the data object.
Methods, systems, and non-transitory computer readable storage media are disclosed for generating digital chain pull paintings in digital images. The disclosed system digitally animates a chain pull painting from a digital drawing path by determining a plurality of digital bead points along the digital drawing path. In response to a movement of one of the digital bead points from a first position to a second position (e.g., based on a pull input performed at a selected digital bead point), the disclosed system determines updated positions of one or more digital bead points along the path. The disclosed system also generates one or more strokes in the digital image from previous positions of the digital bead points to the updated positions of the digital bead points.
An image generation system generates images of objects under different lighting conditions. An image of an object and lighting conditions for an output image are received. The lighting conditions may specify, for instance, a location and/or color of one or more light sources. The image of the object is decomposed into a shading component and a reflectance component. A machine learning model takes the reflectance component and specified lighting conditions as input, and generates an output image of the object under the specified lighting conditions. In some configurations, the machine learning model may be trained on images of objects labeled with object classes, and the output image may be generated by also providing an object class of the object in the image as input to the machine learning model.
G06V 10/60 - Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
G06V 20/70 - Image or video recognition or understanding; Scene-specific elements; Labelling scene content, e.g. deriving syntactic or semantic representations
G06V 10/56 - Extraction of image or video features relating to colour
G06V 10/778 - Active pattern-learning, e.g. online learning of image or video features
Systems and methods for resource allocation are described. The systems and methods include receiving utilization data for computing resources shared by a plurality of users, updating a pricing agent using a reinforcement learning model based on the utilization data, identifying resource pricing information using the pricing agent, and allocating the computing resources to the plurality of users based on the resource pricing information.
G06Q 30/02 - Marketing; Price estimation or determination; Fundraising
G06Q 10/06 - Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
29.
DYNAMIC PATH ANIMATION OF ANIMATION LAYERS AND DIGITAL DESIGN OBJECTS
The present disclosure relates to systems, non-transitory computer-readable media, and methods for generating and modifying digital animations based on user interactions with a unique user interface portraying a one-dimensional layer motion element and/or elements for generating and utilizing animation paths for digital design objects and animation layers. The disclosed systems can provide a dynamic one-dimensional layer motion element that adapts to a selected animation layer and portrays selectable animation frames from the animation layer. The disclosed systems can provide options for generating and modifying various frames of the digital animation based on user interactions with the one-dimensional layer motion element, an animation timeline, and/or a corresponding animation canvas. Additionally, in some embodiments, the disclosed systems also generate path animations with complex animation effects based on user selection of animation paths, digital design objects of animation layers, and corresponding selectable path animation feature tools.
G06T 11/60 - Editing figures and text; Combining figures or text
G06F 3/04845 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range for image manipulation, e.g. dragging, rotation, expansion or change of colour
G06F 3/04847 - Interaction techniques to control parameter settings, e.g. interaction with sliders or dials
30.
PROVIDING AND UTILIZING A ONE-DIMENSIONAL LAYER MOTION ELEMENT TO GENERATE AND MANAGE DIGITAL ANIMATIONS
The present disclosure relates to systems, non-transitory computer-readable media, and methods for generating and modifying digital animations based on user interactions with a unique user interface portraying a one-dimensional layer motion element and/or elements for generating and utilizing animation paths for digital design objects and animation layers. The disclosed systems can provide a dynamic one-dimensional layer motion element that adapts to a selected animation layer and portrays selectable animation frames from the animation layer. The disclosed systems can provide options for generating and modifying various frames of the digital animation based on user interactions with the one-dimensional layer motion element, an animation timeline, and/or a corresponding animation canvas. Additionally, in some embodiments, the disclosed systems also generate path animations with complex animation effects based on user selection of animation paths, digital design objects of animation layers, and corresponding selectable path animation feature tools.
G06F 3/04845 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range for image manipulation, e.g. dragging, rotation, expansion or change of colour
Systems and methods for email processing are described. Embodiments of the present disclosure identify a plurality of email recipients, wherein each of the plurality of email recipients has an email address associated with one of a plurality of internet service providers (ISPs); identify a plurality of internet protocol (IP) addresses, wherein each of the plurality of IP addresses is available for sending email from a user to the plurality of email recipients; compute an ISP-IP score for each of a plurality of ISP-IP pairs based on email delivery statistics; select an IP address from the plurality of IP addresses corresponding to each of the plurality of email recipients based on the ISP-IP score; and transmit an email to each of the plurality of email recipients from the selected IP address.
H04L 51/48 - Message addressing, e.g. address format or anonymous messages, aliases
H04L 51/224 - Monitoring or handling of messages providing notification on incoming messages, e.g. pushed notifications of received messages
H04L 51/23 - Reliability checks, e.g. acknowledgements or fault reporting
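A sketch of the ISP-IP scoring and per-recipient IP selection described in the email-processing abstract above; the delivery-rate score is an assumed stand-in for the patent's email delivery statistics:

    # (ISP, IP) -> (delivered, sent), e.g., gathered from prior sends.
    delivery_stats = {
        ("gmail.com", "10.0.0.1"): (980, 1000),
        ("gmail.com", "10.0.0.2"): (870, 1000),
        ("yahoo.com", "10.0.0.1"): (700, 1000),
        ("yahoo.com", "10.0.0.2"): (940, 1000),
    }

    def isp_ip_score(isp, ip):
        delivered, sent = delivery_stats[(isp, ip)]
        return delivered / sent                # assumed score: delivery rate

    def select_ip(recipient_email, ips=("10.0.0.1", "10.0.0.2")):
        isp = recipient_email.split("@")[1]
        return max(ips, key=lambda ip: isp_ip_score(isp, ip))

    print(select_ip("alice@gmail.com"))   # -> 10.0.0.1
    print(select_ip("bob@yahoo.com"))     # -> 10.0.0.2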
32.
ENTERPRISE APPLICATIONS DRIVEN BY COMMON METADATA REPOSITORY
Systems and methods for enterprise applications supported by common metadata repository are described. One or more aspects of the systems and methods include storing a plurality of entity schemas in a metadata repository, wherein each of the plurality of entity schemas corresponds to a different entity service from a plurality of entity services that interact with an application; storing a plurality of extension schemas in the metadata repository, wherein each of the plurality of extension schemas corresponds to a different extension service from a plurality of extension services utilized by the application; receiving, at the metadata repository from an extension service of the plurality of extension services, an entity schema request indicating an entity schema corresponding to an entity service of the plurality of entity services; and providing, from the metadata repository to the extension service, the entity schema in response to the entity schema request.
Systems and methods for configuring data stream filtering are disclosed. In one embodiment, a method for data stream processing comprises receiving an incoming dataset stream at a data stream processing environment, wherein the dataset stream comprises a data stream; configuring, with a streaming data filter configuration tool, one or more filter parameters for a data filter that receives the data stream; computing, with the streaming data filter configuration tool, one or more filter statistics estimates based on the filter parameters, wherein the filter statistics estimates are computed from sample elements of a representative sample of the data stream retrieved from a representative sample data store; outputting the filter statistics estimates to a workstation user interface; and configuring the data filter to apply the filter parameters to the data stream in response to an instruction from the workstation user interface.
Embodiments are disclosed for eliminating typographical errors from an electronic document. The method may include obtaining an electronic document comprising a plurality of text paragraphs. The method may further include detecting a plurality of typographical errors in the plurality of text paragraphs. The method may further include indexing a set of error paragraphs, wherein each paragraph in the set of error paragraphs includes at least one typographical error. The method may further include determining a priority for each typographical error based on a magnitude of the typographical error. The method may further include adjusting one or more attributes of each paragraph in the set of error paragraphs based on the priority for each typographical error.
Techniques are described for error log anomaly detection. In an implementation, error logs from an application are processed to generate training data. The error logs, for instance, are processed to remove personal information and other data such as numerical strings. The processed error logs are converted into embeddings to generate the training data. The training data is utilized to train an anomaly detection model. For instance, as part of training the anomaly detection model, an anomaly threshold is defined based on a loss value determined from output of the anomaly detection model. Further error logs from the application are then processed by the trained anomaly detection model to determine which of the further error logs are error anomalies, such as based on comparing loss values for the further error logs to the anomaly threshold.
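A simplified sketch of the thresholding step above: loss values on training embeddings define an anomaly threshold, and further error logs are flagged by comparing their loss to it (the embedding and loss below are stand-ins, not the trained model):

    import numpy as np

    rng = np.random.default_rng(0)
    train_embeddings = rng.normal(0, 1, size=(500, 16))   # embedded error logs

    center = train_embeddings.mean(axis=0)
    train_loss = np.linalg.norm(train_embeddings - center, axis=1)

    # Assumed rule: the anomaly threshold is a high quantile of training loss.
    threshold = np.quantile(train_loss, 0.99)

    def is_anomaly(embedding):
        return np.linalg.norm(embedding - center) > threshold

    print(is_anomaly(rng.normal(0, 1, 16)))   # typically False
    print(is_anomaly(np.full(16, 5.0)))       # True: far from normal logs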
The present disclosure describes systems, non-transitory computer-readable media, and methods for accurately and efficiently removing objects from digital images taken from a camera viewfinder stream. For example, the disclosed systems access digital images from a camera viewfinder stream in connection with an undesired moving object depicted in the digital images. The disclosed systems generate a temporal window of the digital images concatenated with binary masks indicating the undesired moving object in each digital image. The disclosed systems further utilize a generator as part of a 3D-to-2D generative adversarial neural network in connection with the temporal window to generate a target digital image with the region associated with the undesired moving object in-painted. In at least one embodiment, the disclosed systems provide the target digital image to a camera viewfinder display to show a user how a future digital photograph will look without the undesired moving object.
G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
G06F 18/2134 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on separation criteria, e.g. independent component analysis
H04N 23/63 - Control of cameras or camera modules using electronic viewfinders
An image generation system enables user input during the process of training a generative model to influence the model's ability to generate new images with desired visual features. A source generative model for a source domain is fine-tuned using training images in a target domain to provide an adapted generative model for the target domain. Interpretable factors are determined for the source generative model and the adapted generative model. A user interface is provided that enables a user to select one or more interpretable factors. The user-selected interpretable factor(s) are used to generate a user-adapted generative model, for instance, by using a loss function based on the user-selected interpretable factor(s). The user-adapted generative model can be used to create new images in the target domain.
G06V 10/774 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or reduction, e.g. principal component analysis [PCA], independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation; Bootstrap methods, e.g. "bagging" or "boosting"
G06F 3/04842 - Selection of displayed objects or displayed text elements
38.
CROPPING FOR EFFICIENT THREE-DIMENSIONAL DIGITAL RENDERING
A method for generating a volume for three-dimensional rendering extracts a plurality of images from a source image input, normalizes the extracted images to have a common pixel size, and determines a notional camera placement for each normalized image to obtain a plurality of annotated normalized images, each annotated with a respective point of view through the view frustum of the notional camera. From the annotated normalized images, the method generates a first volume encompassing a first three-dimensional representation of the target object and selects a smaller subspace within the first volume that encompasses the first three-dimensional representation of the target object. The method generates, from the annotated normalized images, a second volume overlapping the first volume, encompassing a second three-dimensional representation of the target object and having a plurality of voxels, and crops the second volume to limit the second volume to the subspace.
This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that generate a temporally remapped video that satisfies a desired target duration while preserving natural video dynamics. In certain instances, the disclosed systems utilize a playback speed prediction machine-learning model that recognizes and localizes temporally varying changes in video playback speed to re-time a digital video with varying frame-change speeds. For instance, to re-time the digital video, the disclosed systems utilize the playback speed prediction machine-learning model to infer the slowness of individual video frames. Subsequently, in certain embodiments, the disclosed systems determine, from frames of a digital video, a temporal frame sub-sampling that is consistent with the slowness predictions and fits within a target video duration. In certain implementations, the disclosed systems utilize the temporal frame sub-sampling to generate a speed-varying digital video that preserves natural video dynamics while fitting the target video duration.
H04N 21/2343 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs, involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
H04N 21/234 - Processing of video elementary streams, e.g. splicing of video streams or manipulating MPEG-4 scene graphs
H04N 21/24 - Monitoring of processes or resources, e.g. monitoring of server load, available bandwidth or upstream requests
Systems and methods for image processing are described. Embodiments of the present disclosure receive an image depicting an object; generate a sequence of tokens including a set of tokens corresponding to the object and a set of mask tokens corresponding to an additional object to be inserted into the image; generate a placement token value for the set of mask tokens based on the sequence of tokens using a sequence encoder, wherein the placement token value represents position information of the additional object; and insert the additional object into the image based on the position information to obtain a composite image.
G06T 11/60 - Editing figures and text; Combining figures or text
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 20/70 - Image or video recognition or understanding; Scene-specific elements; Labelling scene content, e.g. deriving syntactic or semantic representations
G06V 10/774 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or reduction, e.g. principal component analysis [PCA], independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation; Bootstrap methods, e.g. "bagging" or "boosting"
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
41.
JOINTLY PREDICTING MULTIPLE INDIVIDUAL-LEVEL FEATURES FROM AGGREGATE DATA
An analytics system jointly predicts values for multiple unobserved individual-level features using aggregate data for those features. Given a dataset, a transformation is applied to individual-level information for the dataset to generate transformed data in a higher dimensional space. Bag-wise mean embeddings are generated using the transformed data. The bag-wise mean embeddings and aggregate data for unobserved individual-level features for the dataset are used to train a model to jointly predict values for the unobserved individual-level features for data instances. In particular, a given data instance can be transformed to a representation in a higher dimensional space. Given this representation, the trained model predicts values for the unobserved individual-level features for the data instance, and the data instance can be augmented with the predicted values.
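A hedged sketch of the pipeline above, predicting a single unobserved feature for simplicity: random Fourier-style features stand in for the transformation, and ridge regression maps bag-wise mean embeddings to aggregate labels:

    import numpy as np

    rng = np.random.default_rng(1)
    W = rng.normal(size=(3, 64))                 # assumed random projection

    def transform(X):
        return np.cos(X @ W)                     # higher-dimensional representation

    # Three bags of individuals; only bag-level aggregates are observed.
    bags = [rng.normal(size=(50, 3)) + i for i in range(3)]
    aggregate_labels = np.array([0.1, 0.5, 0.9])

    bag_means = np.stack([transform(b).mean(axis=0) for b in bags])
    # Ridge fit from bag-wise mean embeddings to the aggregate labels.
    beta = np.linalg.solve(bag_means.T @ bag_means + 1e-3 * np.eye(64),
                           bag_means.T @ aggregate_labels)

    # Individual-level prediction: apply the same map to a single data instance.
    x_new = rng.normal(size=(1, 3)) + 2
    print((transform(x_new) @ beta).item())      # predicted individual-level value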
Methods and systems are provided for facilitating generation of fillable document templates. In embodiments, a document having a plurality of tokens is obtained. Using a machine learned model, a token state is identified for each token of the plurality of tokens. Each token state indicates whether a corresponding token is a static token that is to be included in a fillable document template or a dynamic token that is to be excluded from the fillable document template. Thereafter, a fillable document template corresponding with the document is generated, wherein for each dynamic token of the document, the fillable document template includes a fillable field corresponding to the respective dynamic token.
Certain aspects and features of this disclosure relate to modeling shapes using differentiable, signed distance functions. 3D modeling software can edit a 3D model represented using the differentiable, signed distance functions while displaying the model in a manner that is computing resource efficient and fast. Further, such 3D modeling software can automatically create such an editable 3D model from a reference representation that can be obtained in various ways and stored in a variety of formats. For example, a real-world object can be scanned using LiDAR and a reference representation can be produced from the LiDAR data. Candidate procedural models from a library of curated procedural models are optimized to obtain the best procedural model for editing. A selected procedural model provides an editable, reconstructed shape based on the reference representation of the object.
G06F 30/12 - Geometric CAD characterised by design entry means specially adapted for CAD, e.g. graphical user interfaces [GUI] specially adapted for CAD
G06T 19/20 - Manipulating 3D models or images for computer graphics; Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
Techniques and systems are provided for glyph-aware text selection. For instance, a glyph selection system can detect that a user has selected a glyph within a user interface. The glyph selection system can highlight the glyph and/or a region encompassing the glyph to communicate, to the user, that the glyph is selected. This highlighted region can be determined based on the shape and/or outline of the glyph. For example, the glyph selection system can determine bounds (e.g. coordinates) of the glyph in order to highlight a region within the user interface that fully encompasses the glyph and does not include portions of unselected glyphs. In some cases, the highlighted region may be rectangular. Alternatively, the highlighted region may be non-rectangular, such as a border defined by the outline of the glyph.
G06F 3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
45.
SYSTEMS AND METHODS FOR IMAGE PROCESSING USING NATURAL LANGUAGE
Embodiments of the disclosure provide a machine learning model for generating a predicted executable command for an image. The machine learning model includes an interface configured to obtain an utterance indicating a request associated with the image, an utterance sub-model, a visual sub-model, an attention network, and a selection gate. The machine learning model generates a segment of the predicted executable command from weighted probabilities of each candidate token in a predetermined vocabulary, determined based on the visual features, the concept features, current command features, and the utterance features extracted from the utterance or the image.
G06V 10/86 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using graph matching
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
G06V 10/77 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or reduction, e.g. principal component analysis [PCA], independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
Systems and methods for diversity auditing are described. The systems and methods include identifying a plurality of images; detecting a face in each of the plurality of images using a face detection network; classifying the face in each of the plurality of images based on a sensitive attribute using an image classification network; generating a distribution of the sensitive attribute in the plurality of images based on the classification; and computing a diversity score for the plurality of images based on the distribution.
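As a sketch of the final scoring step above, given per-image sensitive-attribute predictions (assumed already produced by the image classification network), one plausible diversity score is the normalized entropy of the attribute distribution:

    import math
    from collections import Counter

    def diversity_score(predicted_attributes):
        counts = Counter(predicted_attributes)
        n = len(predicted_attributes)
        probs = [c / n for c in counts.values()]
        entropy = -sum(p * math.log(p) for p in probs)
        max_entropy = math.log(len(counts)) if len(counts) > 1 else 1.0
        return entropy / max_entropy            # 1.0 = perfectly balanced

    print(diversity_score(["a", "a", "b", "b"]))   # 1.0
    print(diversity_score(["a", "a", "a", "b"]))   # ~0.81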
The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize an iterative neural network framework for generating artistic visual content. For instance, in one or more embodiments, the disclosed systems receive style parameters in the form of a style image and/or a text prompt. In some cases, the disclosed systems further receive a content image having content to include in the artistic visual content. Accordingly, in one or more embodiments, the disclosed systems utilize a neural network to generate the artistic visual content by iteratively generating an image, comparing the image to the style parameters, and updating parameters for generating the next image based on the comparison. In some instances, the disclosed systems incorporate a superzoom network into the neural network for increasing the resolution of the final image and adding art details that are associated with a physical art medium (e.g., brush strokes).
A cluster generation system identifies data elements, from a first binary record, that each have a particular value and correspond to respective binary traits. A candidate description function describing the binary traits is generated, the candidate description function including a model factor that describes the data elements. Responsive to determining that a second record has additional data elements having the particular value and corresponding to the respective binary traits, the candidate description function is modified to indicate that the model factor describes the additional elements. The candidate description function is also modified to include a correction factor describing an additional binary trait excluded from the respective binary traits. Based on the modified candidate description function, the cluster generation system generates a data summary cluster, which includes a compact representation of the binary traits of the data elements and additional data elements.
In implementations of systems for generating images for virtual try-on and pose transfer, a computing device implements a generator system to receive input data describing a first digital image that depicts a person in a pose and a second digital image that depicts a garment. Candidate appearance flow maps are computed that warp the garment based on the pose at different pixel-block sizes using a first machine learning model. The generator system generates a warped garment image by combining the candidate appearance flow maps as an aggregate per-pixel displacement map using a convolutional gated recurrent network. A conditional segmentation mask is predicted that segments portions of a geometry of the person using a second machine learning model. The generator system outputs a digital image that depicts the person in the pose wearing the garment based on the warped garment image and the conditional segmentation mask using a third machine learning model.
Techniques described herein extract form structures from a static form to facilitate making that static form reflowable. A method described herein includes accessing low-level form elements extracted from a static form. The method includes determining, using a first set of prediction models, second-level form elements based on the low-level form elements. Each second-level form element includes a respective one or more low-level form elements. The method further includes determining, using a second set of prediction models, high-level form elements based on the second-level form elements and the low-level form elements. Each high-level form element includes a respective one or more second-level form elements or low-level form elements. The method further includes generating a reflowable form based on the static form by, for each high-level form element, linking together the respective one or more second-level form elements or low-level form elements.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06N 20/10 - Machine learning using kernel methods, e.g. support vector machines [SVM]
51.
Multi-Modal Machine-Learning Model Training for Search
Multi-modal machine-learning model training techniques for search are described that overcome the challenges and inefficiencies of conventional training techniques to support real-time output. In one example, a search system is configured to support multi-modal machine-learning model training through use of a preview mode and an expanded mode. In the preview mode, a preview segment is generated as part of real-time training of a machine learning model. In the expanded mode, the preview segment is persisted as an expanded segment that is used to train and utilize an expanded machine-learning model as part of search.
Embodiments of the present invention provide systems, methods, and computer storage media for generating and recommending responsive visualizations. In an example embodiment, a design specification of a source visualization and an author’s preferences are used to identify and rank compatible sets of candidate responsive transformations (e.g., using answer set programming). Each set is evaluated and ranked according to one or more cost metrics that quantify changes in information density, messaging, or popularity. Some embodiments generate transformation specifications in a declarative grammar that represent the sets of candidate responsive transformations independent of the structure of the source visualization specifications, compile each declarative transformation specification into a rendering grammar specification, and generate a responsive visualization by compiling the rendering grammar specification using a rendering grammar compiler. In some embodiments, the highest ranked responsive visualizations are presented as authoring recommendations and/or the highest ranked responsive visualization is automatically selected and applied.
Techniques for responsive video canvas generation are described to impart three-dimensional effects based on scene geometry to two-dimensional digital objects in a two-dimensional design environment. A responsive video canvas, for instance, is generated from input data including a digital video and scene data. The scene data describes a three-dimensional representation of an environment and includes a plurality of planes. A visual transform is generated and associated with each plane to enable digital objects to interact with the underlying scene geometry. In the responsive video canvas, an edit positioning a two-dimensional digital object with respect to a particular plane of the responsive video canvas is received. A visual transform associated with the particular plane is applied to the digital object and is operable to align the digital object to the depth and orientation of the particular plane. Accordingly, the digital object includes visual features based on the three-dimensional representation.
G06T 19/20 - Manipulating 3D models or images for computer graphics Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
G06T 17/20 - Wire-frame description, e.g. polygonalisation or tessellation
Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing one or more neural networks to recursively subdivide a three-dimensional mesh according to local geometries of vertices in the three-dimensional mesh. For example, the disclosed system can determine a local geometry (e.g., a one-ring neighborhood of half-flaps) for each vertex in a three-dimensional mesh. For each subdivision iteration, the disclosed system can then utilize a neural network to determine displacement coordinates for existing vertices in the three-dimensional mesh and coordinates for new vertices added to edges between the existing vertices in the three-dimensional mesh in accordance with the local geometries of the existing vertices. Furthermore, the disclosed system can generate a subdivided three-dimensional mesh based on the determined displacement coordinates for the existing vertices and the determined coordinates for the new vertices.
The present disclosure describes systems, non-transitory computer-readable media, and methods for utilizing a machine learning model trained to determine subtle pose differentiations to analyze a repository of captured digital images of a particular user to automatically capture digital images portraying the user. For example, the disclosed systems can utilize a convolutional neural network to determine a pose/facial expression similarity metric between a sample digital image from a camera viewfinder stream of a client device and one or more previously captured digital images portraying the user. The disclosed systems can determine that the similarity metric satisfies a similarity threshold, and automatically capture a digital image utilizing a camera device of the client device. Thus, the disclosed systems can automatically and efficiently capture digital images, such as selfies, that accurately match previous digital images portraying a variety of unique facial expressions specific to individual users.
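A toy version of the capture trigger described above compares a viewfinder embedding against embeddings of the user's previously captured images and fires when a similarity threshold is met. The embedding network is assumed upstream, and the 0.9 threshold is illustrative.

```python
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def should_capture(viewfinder_embedding, reference_embeddings, threshold=0.9):
    """Trigger an automatic capture when the current frame's pose/expression
    embedding is close enough to any previously captured image of the user."""
    best = max(cosine_similarity(viewfinder_embedding, ref)
               for ref in reference_embeddings)
    return best >= threshold
```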
The disclosure describes one or more implementations of a neural network architecture pruning system that automatically and progressively prunes neural networks. For instance, the neural network architecture pruning system can automatically reduce the size of an untrained or previously-trained neural network without reducing the accuracy of the neural network. For example, the neural network architecture pruning system jointly trains portions of a neural network while progressively pruning redundant subsets of the neural network at each training iteration. In many instances, the neural network architecture pruning system increases the accuracy of the neural network by progressively removing excess or redundant portions (e.g., channels or layers) of the neural network. Further, by removing portions of a neural network, the neural network architecture pruning system can increase the efficiency of the neural network.
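The channel-removal step might look like the following sketch, which scores output channels by L1 norm, an assumed stand-in for whatever redundancy measure the disclosed system uses, and removes a fixed fraction per training iteration.

```python
import numpy as np

def prune_channels(weight, keep_ratio=0.9):
    """Keep the highest-importance output channels of a (out, in) weight
    matrix, scoring each channel by its L1 norm."""
    scores = np.abs(weight).sum(axis=1)
    n_keep = max(1, int(round(keep_ratio * weight.shape[0])))
    keep = np.sort(np.argsort(scores)[-n_keep:])
    return weight[keep], keep

def progressive_prune(weight, iterations=5, keep_ratio=0.9):
    """One pruning pass per iteration; a joint training step on the smaller
    network (omitted here) would run between passes."""
    for _ in range(iterations):
        weight, _ = prune_channels(weight, keep_ratio)
    return weight
```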
Methods, systems, and non-transitory computer readable media are disclosed for generating artistic images by applying an artistic-effect to one or more frames of a video stream or digital images. In one or more embodiments, the disclosed system captures a video stream utilizing a camera of a computing device. The disclosed system deploys a distilled artistic-effect neural network on the computing device to generate an artistic version of the captured video stream at a first resolution in real time. The disclosed system can provide the artistic video stream for display via the computing device. Based on an indication of a capture event, the disclosed system utilizes the distilled artistic-effect neural network to generate an artistic image at a higher resolution than the artistic video stream. Furthermore, the disclosed system tunes and utilizes an artistic-effect patch generative adversarial neural network to modify parameters for the distilled artistic-effect neural network.
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/40 - Extraction of image or video features
G06V 10/56 - Extraction of image or video features relating to colour
H04N 23/63 - Control of cameras or camera modules using electronic viewfinders
58.
ENHANCING LIGHT TEXT IN SCANNED DOCUMENTS WHILE PRESERVING DOCUMENT FIDELITY
The present disclosure relates to systems, non-transitory computer-readable media, and methods that implement an image filter for enhancing light text and removing document shadows. In particular embodiments, the disclosed systems use a modified adaptive thresholding approach that relies on image gradients to efficiently guide the thresholding process and produce a document text mask. In addition, the disclosed systems use a machine-learning model to generate a document shadow map. The document shadow map can include text reflections. Accordingly, the disclosed systems remove text reflections from the document shadow map (e.g., by using an interpolated shadow intensity value of neighboring shadow map pixels). In turn, the disclosed systems use the document text mask and the document shadow map cleaned of text reflections to remove shadows from the digital image. Further, the disclosed systems enhance text in the shadow-removed digital image based on contrast stretching.
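The final contrast-stretching step can be sketched directly; the percentile endpoints below are assumptions rather than values from the disclosure.

```python
import numpy as np

def contrast_stretch(gray, low_pct=2, high_pct=98):
    """Linearly remap intensities so the chosen percentiles hit 0 and 255,
    darkening light text after shadow removal."""
    lo, hi = np.percentile(gray, [low_pct, high_pct])
    stretched = (gray.astype(float) - lo) / max(hi - lo, 1e-6)
    return np.clip(stretched * 255, 0, 255).astype(np.uint8)
```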
Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing unsupervised learning of discrete human motions to generate digital human motion sequences. The disclosed system utilizes an encoder of a discretized motion model to extract a sequence of latent feature representations from a human motion sequence in an unlabeled digital scene. The disclosed system also determines sampling probabilities from the sequence of latent feature representations in connection with a codebook of discretized feature representations associated with human motions. The disclosed system converts the sequence of latent feature representations into a sequence of discretized feature representations by sampling from the codebook based on the sampling probabilities. Additionally, the disclosed system utilizes a decoder to reconstruct a human motion sequence from the sequence of discretized feature representations. The disclosed system also utilizes a reconstruction loss and a distribution loss to learn parameters of the discretized motion model.
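The codebook sampling step admits a compact sketch. It assumes the sampling probabilities are a softmax over negative squared distances to the codebook entries, which is one plausible parameterization rather than necessarily the disclosed one.

```python
import numpy as np

def quantize_motion(latents, codebook, temperature=1.0, seed=0):
    """Convert a (T, D) latent sequence into discretized features by sampling
    from a (K, D) codebook according to distance-based probabilities."""
    rng = np.random.default_rng(seed)
    out = np.empty_like(latents)
    for t, z in enumerate(latents):
        d2 = np.sum((codebook - z) ** 2, axis=1)  # distance to each code
        logits = -d2 / temperature
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        out[t] = codebook[rng.choice(len(codebook), p=probs)]
    return out
```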
In implementations of text wrap modification using variable inset, a display screen of a device displays lines of text wrapped to an inset space maintained between an object boundary and the lines of text. The device implements a text wrap modification module to determine that a penalty value associated with a line of text is reduced if the line of text is extended to include one or more words from a subsequent line of text, determine that the one or more words fit within an additional space for the line of text based on a variable overlap of the line of text into the inset space, and display the line of text as extended to include the one or more words from the subsequent line of text.
In implementations of repeat object blending, a computing device implements a repeat object blending system, which is implemented to receive a digital image depicting a first object and a second object, where the first object is depicted as multiple instances of a repeated base object, and the second object is depicted as multiple instances of a visually different repeated base object. The repeat object blending system can identify visual characteristics of the first object and the second object. The repeat object blending system can then generate an intermediate object by blending one or more of the visual characteristics of the first object and one or more of the visual characteristics of the second object. The resulting intermediate object is a visual representation of the repeated base object blended with the visually different repeated base object.
Methods and systems are provided for facilitating identification of sensitive content. In embodiments described herein, a set of sensitive topics is obtained. Each sensitive topic in the set of sensitive topics can include subject matter that may be deemed sensitive to one or more individuals. Thereafter, the set of sensitive topics is expanded to an expanded set of sensitive topics using a first machine learning model. The expanded set of sensitive topics is used to train a second machine learning model to predict potential sensitive content in relation to input content.
A control system facilitates active management of a streaming data system. Given historical data traffic for each data stream processed by a streaming data system, the control system uses a machine learning model to predict future data traffic for each data stream. The control system selects a matching between data streams and servers for a future time that minimizes a cost comprising a switching cost and a server imbalance cost based on the predicted data traffic for the future time. In some configurations, the matching is selected using a planning window comprising a number of future time steps dynamically selected based on uncertainty associated with the predicted data traffic. Given the selected matching, the control system may manage the streaming data system by causing data streams to be moved between servers based on the matching.
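A brute-force sketch of the matching objective follows. It assumes the switching cost counts moved streams and the server imbalance cost is the variance of predicted per-server utilisation; the weights and the exhaustive search are illustrative simplifications of whatever solver a production system would use.

```python
import numpy as np
from itertools import product

def matching_cost(assignment, prev_assignment, predicted_load, capacity,
                  switch_weight=1.0, imbalance_weight=1.0):
    """Cost of a stream-to-server assignment for one future time step."""
    switching = sum(a != b for a, b in zip(assignment, prev_assignment))
    loads = np.zeros(len(capacity))
    for stream, server in enumerate(assignment):
        loads[server] += predicted_load[stream]
    imbalance = float(np.var(loads / np.asarray(capacity, dtype=float)))
    return switch_weight * switching + imbalance_weight * imbalance

def best_matching(n_servers, prev_assignment, predicted_load, capacity):
    """Exhaustive search over assignments; tractable only for tiny instances."""
    return min(
        product(range(n_servers), repeat=len(predicted_load)),
        key=lambda a: matching_cost(a, prev_assignment, predicted_load, capacity),
    )
```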
In implementations of systems for cloud-based resource allocation using meters, a computing device implements a resource system to receive resource data describing an amount of cloud-based resources reserved for consumption by client devices during a period of time and a total amount of cloud-based resources consumed by the client devices during the period of time. The resource system determines a consumption distribution using each meter included in a set of meters. Each of the consumption distributions allocates a portion of the total amount of the cloud-based resources consumed to each client device of the client devices. A particular meter used to determine a particular consumption distribution is selected based on a Kendall Tau coefficient of the particular consumption distribution. An amount of cloud-based resources to allocate for a future period of time is estimated using the particular meter and an approximate Shapley value.
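The Kendall Tau selection criterion can be sketched as follows; the meter interface (a callable returning a per-client allocation) is a hypothetical stand-in, and ties are ignored for brevity.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall rank correlation of two equal-length sequences (no ties)."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) / 2)

def select_meter(meters, reserved, consumed):
    """Pick the meter whose consumption distribution best preserves the
    ranking of actual consumption, scored by Kendall tau."""
    return max(meters,
               key=lambda meter: kendall_tau(meter(reserved, consumed), consumed))
```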
Methods, systems, and non-transitory computer readable storage media are disclosed for utilizing offline models to warm start online bandit learner models. For example, the disclosed system can determine relevant offline models for an environment based on reward estimate differences between the offline models and the online model. The disclosed system can then utilize the relevant offline models (if any) to select an arm for the environment. The disclosed system can update the online model based on observed rewards for the selected arm. Additionally, the disclosed system can also use entropy reduction of arms to determine the utility of the arms in differentiating relevant and irrelevant offline models. For example, the disclosed system can select an arm based on a combination of the entropy reduction of the arm and the reward estimate for the arm and use the observed reward to update an observation history.
G06N 5/04 - Inference or reasoning models
G06F 18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
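One plausible reading of the relevance test in the abstract above is a tolerance band on per-arm reward-estimate differences; the fixed tolerance and the simple averaging of surviving offline models below are assumptions, not the disclosed procedure.

```python
import numpy as np

def relevant_offline_models(offline_estimates, online_estimate, tolerance):
    """Keep offline models whose per-arm reward estimates stay within a band
    around the online model's current estimates."""
    return [est for est in offline_estimates
            if np.max(np.abs(np.asarray(est) - online_estimate)) <= tolerance]

def select_arm(online_estimate, offline_estimates, tolerance=0.1):
    """Warm start: trust surviving offline priors; else fall back online."""
    relevant = relevant_offline_models(offline_estimates, online_estimate,
                                       tolerance)
    if relevant:
        return int(np.argmax(np.mean(relevant, axis=0)))
    return int(np.argmax(online_estimate))
```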
Systems and methods for key-phrase extraction are described. The systems and methods include receiving a transcript including a text paragraph and generating key-phrase data for the text paragraph using a key-phrase extraction network. The key-phrase extraction network is trained to identify domain-relevant key-phrase data based on domain data obtained using a domain discriminator network. The systems and methods further include generating meta-data for the transcript based on the key-phrase data.
G10L 15/22 - Procedures used during the speech recognition process, e.g. man-machine dialogue
G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
G10L 15/16 - Speech classification or search using artificial neural networks
67.
GENERATING COLLAGE DIGITAL IMAGES BY COMBINING SCENE LAYOUTS AND PIXEL COLORS UTILIZING GENERATIVE NEURAL NETWORKS
The present disclosure relates to systems, methods, and non-transitory computer readable media for generating digital images depicting photorealistic scenes utilizing a digital image collaging neural network. For example, the disclosed systems utilize a digital image collaging neural network having a particular architecture for disentangling generation of scene layouts and pixel colors for different regions of a digital image. In some cases, the disclosed systems break down the process of generating a collage digital image into generating images representing different regions such as a background and a foreground to be collaged into a final result. For example, utilizing the digital image collaging neural network, the disclosed systems determine scene layouts and pixel colors for both foreground digital images and background digital images to ultimately collage the foreground and background together into a collage digital image depicting a real-world scene.
The present disclosure relates to systems, methods, and non-transitory computer readable media for training a generative inpainting neural network to accurately generate inpainted digital images via object-aware training and/or masked regularization. For example, the disclosed systems utilize an object-aware training technique to learn parameters for a generative inpainting neural network based on masking individual object instances depicted within sample digital images of a training dataset. In some embodiments, the disclosed systems also (or alternatively) utilize a masked regularization technique as part of training to prevent overfitting by penalizing a discriminator neural network utilizing a regularization term that is based on an object mask. In certain cases, the disclosed systems further generate an inpainted digital image utilizing a trained generative inpainting model with parameters learned via the object-aware training and/or the masked regularization.
Systems and methods for image generation are described. Embodiments of the present disclosure receive a text phrase that describes a target image to be generated; generate text features based on the text phrase; retrieve a search image based on the text phrase; and generate the target image using an image generation network based on the text features and the search image.
Systems and methods for image processing are described. The systems and methods include receiving a plurality of frames of a video at an edge device, wherein the video depicts an action that spans the plurality of frames, compressing, using an encoder network, each of the plurality of frames to obtain compressed frame features, wherein the compressed frame features include fewer data bits than the plurality of frames of the video, classifying, using a classification network, the compressed frame features at the edge device to obtain action classification information corresponding to the action in the video, and transmitting the action classification information from the edge device to a central server.
H04N 19/176 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object, the region being a block, e.g. a macroblock
H04N 19/61 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using transform coding combined with predictive coding
H04N 19/172 - Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding, the unit being an image region, e.g. an object, the region being a picture, frame or field
71.
TRAINING A MODEL FOR PERFORMING ABSTRACTIVE TEXT SUMMARIZATION
Techniques for training for and performing abstractive text summarization are disclosed. Such techniques include, in some embodiments, obtaining textual content, and generating a reconstruction of the textual content using a trained language model, the reconstructed textual content comprising an abstractive summary of the textual content generated based on relative importance parameters associated with respective portions of the textual content. In some cases, the trained language model includes a neural network language model that has been trained by identifying a plurality of discrete portions of training textual content, receiving the plurality of discrete portions of the training textual content as input to the language model, and predicting relative importance parameters associated with respective ones of the plurality of discrete portions of the training textual content, the relative importance parameters each being based at least on one or more linguistic similarity measures with respect to a ground truth.
Methods, systems, and non-transitory computer readable media are disclosed for intelligently generating partially textured accessible images. In one or more embodiments, the disclosed systems generate or access a color texture map specific to a given color vision deficiency. For example, the disclosed systems generate a color texture map that divides a color space into one or more textured segments of colors and one or more complementary untextured segments of colors. The disclosed systems can utilize the color texture map to intelligently apply different textures to subsets of pixels that contribute to color vision deficiency confusion. For example, the disclosed system maps pixels from a color image to the color texture map to identify a first subset of pixels corresponding to the textured segment of colors. The disclosed system generates a partially textured accessible image by applying a texture to the first subset of pixels.
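Applying the texture to only the confusable subset of pixels might look like this sketch, where the confusable predicate is a hypothetical lookup into the color texture map.

```python
import numpy as np

def apply_texture_map(image, confusable, texture):
    """Overlay a texture on only those pixels whose colors fall in the
    textured segment of the color space.

    image:      (H, W, 3) uint8 color image
    confusable: maps an (N, 3) color array to a boolean mask
    texture:    (H, W) float pattern in [0, 1], e.g. diagonal stripes
    """
    h, w, _ = image.shape
    mask = confusable(image.reshape(-1, 3)).reshape(h, w)
    out = image.astype(float)
    out[mask] *= texture[mask, None]   # modulate only the confusable pixels
    return out.astype(np.uint8)
```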
Techniques for training a language model for code switching content are disclosed. Such techniques include, in some embodiments, generating a dataset, which includes identifying one or more portions within textual content in a first language, the identified one or more portions each including one or more of offensive content or non-offensive content; translating the identified one or more portions to a second language; and reintegrating the translated one or more portions into the textual content to generate code-switched textual content. In some cases, the textual content in the first language includes offensive content and non-offensive content, the identified one or more portions include the offensive content, and the translated one or more portions include a translated version of the offensive content. In some embodiments, the code-switched textual content is at least part of a synthetic dataset usable to train a language model, such as a multilingual classification model.
G06F 40/58 - Use of machine translation, e.g. for multi-lingual retrieval, for server-side translation for client devices or for real-time translation
G06F 40/47 - Machine-assisted translation, e.g. using translation memory
An analytics system identifies interventions for individual samples from a set of samples with a mixture of interventions. Given a causal graph, a set of baseline samples, and a set of samples with interventions, a set of intervention tuples is determined that represents the mixture of interventions for the set of samples with interventions. Each intervention tuple in the set of intervention tuples identifies an intervention and a mixing coefficient representing a percentage of samples with the intervention. An iterative process is used in which a set of intervention tuples is determined for N variables and then lifted to a set of intervention tuples for N+1 variables until all variables from the causal graph have been considered, providing a final set of intervention tuples. The final set of intervention tuples is used to match individual samples from the set of samples with interventions to interventions.
Keyword localization digital image search techniques are described. These techniques support an ability to indicate “where” a corresponding keyword is to be expressed with respect to a layout in a respective digital image resulting from a search query. The search query may also include an indication of a size of the keyword as expressed in the digital image, a number of instances of the keyword, and so forth. Additionally, the techniques and systems as described herein support real time search through use of keyword signatures.
G06F 16/583 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually, using metadata automatically derived from the content
G06F 16/51 - Indexing; Data structures therefor; Storage structures
G06F 16/54 - Browsing; Visualisation therefor
76.
VISUAL SPEECH RECOGNITION FOR DIGITAL VIDEOS UTILIZING GENERATIVE ADVERSARIAL LEARNING
This disclosure describes one or more implementations of systems, non-transitory computer-readable media, and methods that recognize speech from a digital video utilizing an unsupervised machine learning model, such as a generative adversarial neural network (GAN) model. In one or more implementations, the disclosed systems utilize an image encoder to generate self-supervised deep visual speech representations from frames of an unlabeled (or unannotated) digital video. Subsequently, in one or more embodiments, the disclosed systems generate viseme sequences from the deep visual speech representations (e.g., via segmented visemic speech representations from clusters of the deep visual speech representations) utilizing the adversarially trained GAN model. Indeed, in some instances, the disclosed systems decode the viseme sequences belonging to the digital video to generate an electronic transcription and/or digital audio for the digital video.
G10L 15/25 - Speech recognition using non-acoustical features using position of the lips, movement of the lips or face analysis
G06K 9/62 - Methods or arrangements for recognition using electronic means
G06V 20/40 - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING Scene-specific elements in video content
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G10L 15/16 - Speech classification or search using artificial neural networks
G10L 13/02 - Methods for producing synthetic speech; Speech synthesisers
G10L 15/22 - Procedures used during the speech recognition process, e.g. man-machine dialogue
G10L 25/57 - Speech or voice analysis techniques not restricted to a single one of the groups, specially adapted for particular use for comparison or discrimination, for processing of video signals
Certain aspects and features of this disclosure relate to virtual 3D pointing and manipulation. For example, video communication is established between a presenter client device and a viewer client device. A presenter video image is captured. A 3D image of a 3D object is rendered on the client devices and a presenter avatar is rendered on at least the viewer client device. The presenter avatar includes at least a portion of the presenter video image. When a positional input is detected at the presenter client device, the system renders, on the viewer client device, an articulated virtual appurtenance associated with the positional input, the 3D image, and the presenter avatar. The virtual interaction between the articulated virtual appurtenance and the 3D image appears to the viewer as naturally positioned for the interaction.
G06T 19/20 - Manipulating 3D models or images for computer graphics Editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
G06T 7/73 - Determining position or orientation of objects or cameras using feature-based methods
G06F 3/04815 - Interaction taking place in a metaphor-based or object-based environment with a three-dimensional display, e.g. changing the user's viewpoint with respect to the environment or object
Systems and methods for natural language processing are described. Embodiments of the present disclosure receive an input phrase including an aspect term; generate a complement phrase based on the input phrase using a language generator model, wherein the complement phrase includes different words than the input phrase; combine a representation of the input phrase and a representation of the complement phrase to obtain an augmented representation of the input phrase; and generate sentiment information corresponding to the aspect term based on the augmented representation.
Systems and methods for image processing are described. Embodiments of the present disclosure receive a training image and a caption for the training image, wherein the caption includes text describing an object in the training image; generate a pseudo mask for the object using a teacher network based on the text describing the object; generate a mask for the object using a student network; and update parameters of the student network based on the mask and the pseudo mask.
G06V 10/778 - Active pattern-learning, e.g. online learning of image or video features
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 10/86 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using graph matching
G06V 10/22 - Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
G06V 10/77 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/776 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation Performance evaluation
The present technology provides for facilitating removal of undesired search results. In one embodiment, a search request including one or more search terms to use for performing a search is obtained. Thereafter, a search query is generated to execute the search. The search query includes the search terms and a removal parameter indicating a particular search result to exclude from search results returned in response to the search request. A set of search results is provided for display via a user device. Such a set of search results can be identified based on execution of the search query and exclude the particular search result.
A data augmentation framework enhances the prediction accuracy of tensor completion methods. An array having a set of cells associated with a set of entities is received. Influence metrics of cells from the array are determined based on an influence of the cells on minimizing loss while training a machine learning model. An entity-importance metric is generated for each entity of the set of entities based on the influence metrics. A cell from the array for which to augment the array with a predicted value is identified. The cell is identified based on a sampling of the set of entities that is weighted by the entity-importance metric for each entity of the set of entities.
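The importance-weighted sampling step can be sketched under the assumption that an entity's importance is the sum of influence metrics over its observed cells, with rows and columns standing in for the two entity sets of a matrix-shaped array.

```python
import numpy as np

def pick_cell_to_augment(influence, observed, seed=0):
    """Sample an unobserved (row, col) cell to fill with a predicted value,
    weighting each axis by entity importance. Assumes at least one
    unobserved cell exists."""
    rng = np.random.default_rng(seed)
    masked = np.where(observed, influence, 0.0)
    row_p = masked.sum(axis=1)
    row_p = row_p / row_p.sum()
    col_p = masked.sum(axis=0)
    col_p = col_p / col_p.sum()
    while True:
        r = rng.choice(len(row_p), p=row_p)
        c = rng.choice(len(col_p), p=col_p)
        if not observed[r, c]:
            return r, c
```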
In implementations of systems for generating and applying editing presets, a computing device implements a preset system to detect objects depicted in a digital image that is displayed in a user interface of an application for editing digital content. Input data is received describing an edited region of the digital image and properties of an editing operation performed in the edited region. The preset system identifies a particular detected object of the detected objects based on a bounding box of the particular detected object and an area of the edited region. An additional digital image is edited by applying the properties of the editing operation to a detected object that is depicted in the additional digital image based on a classification of the detected object and a classification of the particular detected object.
G06F 3/04845 - Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range, for image manipulation, e.g. dragging, rotation, expansion or change of colour
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06F 3/0482 - Interaction with lists of selectable items, e.g. menus
Embodiments of the present invention provide systems, methods, and non-transitory computer storage media for generating an ambient occlusion (AO) map for a 2D image that can be combined with the 2D image to adjust the contrast of the 2D image based on the geometric information in the 2D image. In embodiments, using a trained neural network, an AO map for a 2D image is automatically generated without any predefined 3D scene information. Optimizing the neural network to generate an estimated AO map for a 2D image requires training, testing, and validating the neural network using a synthetic dataset comprised of pairs of images and ground truth AO maps rendered from 3D scenes. By using an estimated AO map to adjust the contrast of a 2D image, the contrast of the image can be adjusted to make the image appear lifelike by modifying the shadows and shading in the image based on the ambient lighting present in the image.
Techniques for content-aware font recommendations include obtaining an electronic document comprising an image and text. The image is processed using one or more convolutional neural networks to determine one or more image tags. The image tags are mapped to one or more font tags using a user map, a designer map, or one or more contextual synonyms of the image tags. A font to recommend for the electronic document is then determined using the one or more font tags.
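The tag-mapping logic above reduces to dictionary lookups; user_map, designer_map, synonyms, and font_index below are hypothetical structures standing in for the maps named in the abstract.

```python
def recommend_fonts(image_tags, user_map, designer_map, synonyms, font_index):
    """Map image tags to font tags via the user map, the designer map, and
    contextual synonyms, then return fonts carrying any matched tag."""
    font_tags = set()
    for tag in image_tags:
        for candidate in [tag, *synonyms.get(tag, [])]:
            font_tags.update(user_map.get(candidate, []))
            font_tags.update(designer_map.get(candidate, []))
    return [font for font, tags in font_index.items()
            if font_tags & set(tags)]
```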
The present disclosure relates to systems, methods, and non-transitory computer readable media for accurately and flexibly generating modified digital images utilizing a novel swapping autoencoder that incorporates scene layout. In particular, the disclosed systems can receive a scene layout map that indicates or defines locations for displaying specific digital content within a digital image. In addition, the disclosed systems can utilize the scene layout map to guide combining portions of digital image latent code to generate a modified digital image with a particular textural appearance and a particular geometric structure defined by the scene layout map. Additionally, the disclosed systems can utilize a scene layout map that defines a portion of a digital image to modify by, for instance, adding new digital content to the digital image, and can generate a modified digital image depicting the new digital content.
This disclosure describes one or more implementations of a digital image semantic layout manipulation system that generates refined digital images resembling the style of one or more input images while following the structure of an edited semantic layout. For example, in various implementations, the digital image semantic layout manipulation system builds and utilizes a sparse attention warped image neural network to generate high-resolution warped images and a digital image layout neural network to enhance and refine the high-resolution warped digital image into a realistic and accurate refined digital image.
G06V 10/46 - Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
G06V 30/24 - Character recognition characterised by the processing or recognition method
G06F 18/213 - Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
Methods, systems, and non-transitory computer readable media are disclosed for intelligently generating comprehensive and relevant visual data stories from tabular data. The disclosed system can create data stories from a given dataset guided by user selections that set a theme for the story. In particular, the disclosed system evaluates facts of a dataset to identify significant facts. The disclosed system can order significant facts to create a backbone for the data story based on user selections by utilizing a beam search. Furthermore, the disclosed system can expand the backbone of the data story by generating coherence facts. The disclosed system may present the generated story via a graphical user interface.
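The backbone construction via beam search can be sketched generically; score(seq, fact) is a hypothetical scorer combining significance and coherence with the partial story.

```python
def beam_search_story(facts, score, beam_width=3, length=5):
    """Grow fact sequences, keeping the beam_width highest-scoring partial
    stories at each step; assumes len(facts) >= length."""
    beams = [((), 0.0)]
    for _ in range(length):
        candidates = [(seq + (fact,), total + score(seq, fact))
                      for seq, total in beams
                      for fact in facts if fact not in seq]
        beams = sorted(candidates, key=lambda b: b[1], reverse=True)[:beam_width]
    return beams[0][0]
```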
Vector object transformation techniques are described that support generation of a transformed vector object based on a first vector object and a second vector object. A plurality of paths for a first and second vector object, for instance, are generated. Corresponding paths are determined by detecting which of the plurality of paths from the first vector object correspond to which of the plurality of paths from the second vector object. A mapping of control points between the first and second vector objects is generated. Using the mapping, a transformation of the first vector object is generated by adjusting one or more control points of the first vector object. As a result, the transformed vector object includes visual characteristics based on both the first vector object and the second vector object.
Techniques and systems are described for generating location based photo discovery suggestions. Generally, photo discovery data is generated and utilized to form discovery suggestions that identify suggested locations for capturing photographs, as well as other capture-related information that is presented to assist a user in capturing photographs of interest to the user. A discovery suggestion, for example, not only identifies a location of potential photographic interest to a user, but also includes information such as camera settings and suggested camera equipment for capturing photos at the location. A discovery suggestion thus guides a user as to how to maximize a likelihood that a digital image captured by the user includes subject matter of interest to the user and is also visually pleasing.
The present disclosure relates to a digital asset synchronization system that provides improved digital asset management and synchronization of a digital asset stored either within a component database or a packaged file. For example, the digital asset synchronization system enables a set of components that makes up a digital asset to appear as a singular packaged file, while also maintaining the benefits of having the digital asset made up of the components. In this manner, the digital asset synchronization system provides a bridge between a digital asset stored in a packaged file format and conventional file formats. In addition, the digital asset synchronization system also provides digital asset management and improved synchronization between a client device and a cloud storage system.
H04L 67/1095 - Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
91.
ORGANIZING A GRAPHIC DESIGN DOCUMENT USING SEMANTIC LAYERS
Embodiments are disclosed for semantically organizing a graphic design document. A method of semantically organizing a graphic design document can include obtaining a document, identifying a plurality of layers associated with the document, determining a plurality of semantic labels associated with the plurality of layers, determining a semantic layer hierarchy of the plurality of layers, and organizing the plurality of layers based at least on the semantic layer hierarchy.
G06F 3/0481 - Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or on a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour
G06F 3/04842 - Selection of displayed objects or displayed text elements
92.
HIERARCHICAL IMAGE GENERATION VIA TRANSFORMER-BASED SEQUENTIAL PATCH SELECTION
Systems and methods for image processing are described. Embodiments of the present disclosure identify a first image depicting a first object; identify a plurality of candidate images depicting a second object; select a second image from the plurality of candidate images depicting the second object based on the second image and a sequence of previous images including the first image using a crop selection network trained to select a next compatible image based on the sequence of previous images; and generate a composite image depicting the first object and the second object based on the first image and the second image.
G06V 10/26 - Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06T 5/50 - Image enhancement or restoration using two or more images, e.g. averaging, subtraction
93.
AUTOMATICALLY DETECTING USER-REQUESTED OBJECTS IN DIGITAL IMAGES
The present disclosure relates to an object selection system that accurately detects and optionally automatically selects user-requested objects (e.g., query objects) in digital images. For example, the object selection system builds and utilizes an object selection pipeline to determine which object detection neural network to utilize to detect a query object based on analyzing the object class of a query object. In particular, the object selection system can identify both known object classes as well as objects corresponding to unknown object classes.
G06F 18/2113 - Selection of the most significant subset of features by ranking or filtering the set of features, e.g. using a measure of variance or of feature cross-correlation
G06V 10/764 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
G06V 10/82 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
G06V 20/70 - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING Scene-specific elements Labelling scene content, e.g. deriving syntactic or semantic representations
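The class-based routing at the heart of the object selection pipeline described in the abstract above reduces to a dispatch; the detector callables below are hypothetical.

```python
def detect_query_object(image, query_class, specialist_detectors,
                        open_vocab_detector):
    """Route to a specialist detector when the query class is known;
    otherwise fall back to an open-vocabulary detector."""
    for known_classes, detector in specialist_detectors:
        if query_class in known_classes:
            return detector(image, query_class)
    return open_vocab_detector(image, query_class)
```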
The present disclosure relates to systems, methods, and non-transitory computer readable media that utilize a continuous kernel neural network that learns continuous reconstruction kernels to merge digital image samples in local neighborhoods and generate enhanced digital images from a plurality of burst digital images. For example, the disclosed systems can utilize an alignment model to align image samples from burst digital images to a common coordinate system (e.g., without resampling). In some embodiments, the disclosed systems generate localized latent vector representations of kernel neighborhoods and determine continuous displacement vectors between the image samples and output pixels of the enhanced digital image. The disclosed systems can utilize the continuous kernel network together with the latent vector representations and continuous displacement vectors to generate learned kernel weights for combining the image samples and generating an enhanced digital image.
G06T 5/50 - Image enhancement or restoration using two or more images, e.g. averaging, subtraction
G06T 7/33 - Determination of transform parameters for the alignment of images, i.e. image registration, using feature-based methods
Directional propagation editing techniques are described. In one example, a digital image, a depth map, and a direction are obtained by an image editing system. The image editing system then generates features from the digital image and the depth map for each pixel based on the direction, e.g., until an edge of the digital image is reached. In an implementation, instead of storing a value of the depth directly, a ratio is stored based on a depth in the depth map and a depth of a point along the direction. The image editing system then forms a feature volume using the features, e.g., as three dimensionally stacked features. The feature volume is employed by the image editing system as part of editing the digital image to form an edited digital image.
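The depth-ratio feature walk can be sketched directly; the unit step size and nearest-pixel rounding are assumptions.

```python
import numpy as np

def directional_features(depth, start, direction, max_steps=64):
    """Walk from `start` along `direction`, recording the ratio of each
    visited depth to the starting depth (rather than raw depth), stopping
    at the image edge."""
    h, w = depth.shape
    y0, x0 = start
    d0 = max(float(depth[y0, x0]), 1e-6)
    dy, dx = direction
    feats = []
    for step in range(1, max_steps + 1):
        y = int(round(y0 + step * dy))
        x = int(round(x0 + step * dx))
        if not (0 <= y < h and 0 <= x < w):
            break  # stop at the edge of the digital image
        feats.append(float(depth[y, x]) / d0)
    return np.array(feats)
```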
An illustrator system accesses a multi-element document, the multi-element document including a plurality of elements. The illustrator system determines, for each of the plurality of elements, an element-specific topic distribution comprising a ranked list of topics. The illustrator system creates a first aggregated topic distribution from the determined element-specific topic distributions. The illustrator system determines a global intent for the multi-element document, the global intent including one or more terms from the first aggregated topic distribution. The illustrator system queries a database using the global intent to retrieve a substitute element. The illustrator system generates a replacement multi-element document that includes the substitute element in place of an element in the multi-element document. The substitute element is different from the element in the displayed multi-element document.
G06F 40/166 - Text processing Editing, e.g. inserting or deleting
G06F 40/106 - Display of the layout of documents; Previewing
G06V 30/413 - Classification of content, e.g. text, photographs or tables
G06F 16/58 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
G06F 16/38 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
97.
AUTOMATICALLY GENERATING SEMANTIC LAYERS IN A GRAPHIC DESIGN DOCUMENT
Embodiments are disclosed for creating and managing semantic layers in a graphic design system. A method of creating and managing semantic layers includes receiving a selection of a content type to be generated, receiving a selection of a location in a digital canvas to place content of the content type, generating, using one or more machine learning models, content of the selected content type at the location in the digital canvas, and automatically adding the content to a layer associated with the digital canvas based on a semantic label associated with the content.
G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
Embodiments are disclosed for performing content linting in a graphic design system. A method of content linting includes receiving a selection of a content type to be generated, receiving a selection of a location in a digital canvas to place content of the content type, determining a placement context associated with the location in the digital canvas, identifying one or more content rules applicable to the content based on a static analysis of the placement context, and generating, using one or more machine learning models, content of the selected content type at the location in the digital canvas using the one or more content rules.
G06T 11/60 - Editing of figures and text; Combining figures or text
G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
99.
VIDEO RECOMMENDER SYSTEM BY KNOWLEDGE BASED MULTI-MODAL GRAPH NEURAL NETWORKS
Systems and methods for item recommendation are described. Embodiments of the present disclosure receive input indicating a relationship between a user and a first content item; generate a knowledge graph based on the input, wherein the knowledge graph comprises relationship information between the user and a plurality of content items; generate a first feature embedding representing the user and a second feature embedding representing a second content item of the plurality of content items based on the knowledge graph, wherein the second feature embedding is generated using a first modality for a query vector of an attention mechanism and a second modality for a key vector and a value vector of the attention mechanism; compare the first feature embedding to the second feature embedding to obtain a similarity score; and recommend the second content item for the user based on the similarity score.
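The asymmetric use of modalities in the attention mechanism, with one modality supplying the query and another supplying the keys and values, can be sketched as single-head dot-product attention; the feature matrices are assumed to share an embedding dimension.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_feats, kv_feats):
    """Queries from one modality (e.g. text), keys/values from another
    (e.g. visual); both are (n, d) arrays with a shared dimension d."""
    scores = query_feats @ kv_feats.T / np.sqrt(query_feats.shape[-1])
    return softmax(scores) @ kv_feats
```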