AI App Inventor

AI App Inventor — independent reviews, comparisons, pricing and step-by-step guides on Aizhi.

  • Bag-of-words model

    Bag-of-words model

    The bag-of-words (BoW) model is a model of text which uses an unordered collection (a "bag") of words. It is used in natural language processing and information retrieval (IR). It disregards word order (and thus most of syntax or grammar) but captures multiplicity. The bag-of-words model is commonly used in methods of document classification where, for example, the (frequency of) occurrence of each word is used as a feature for training a classifier. It has also been used for computer vision. An early reference to "bag of words" in a linguistic context can be found in Zellig Harris's 1954 article on Distributional Structure.

    == Definition ==

    The following models a text document using bag-of-words. Starting from two simple text documents, a list of words is constructed for each document. Each bag-of-words can then be represented as a JSON object and assigned to a JavaScript variable (BoW1 and BoW2, respectively). Each key is a word, and each value is the number of occurrences of that word in the given text document. The order of elements is free, so, for example, {"too":1,"Mary":1,"movies":2,"John":1,"watch":1,"likes":2,"to":1} is equivalent to BoW1 regardless of how its keys are ordered. This is also what we expect from a strict JSON object representation.

    If a third document is formed by joining these two, its bag-of-words is the "union" of the two bags. In bag algebra this is, formally, the disjoint union, which sums the multiplicities of each element.

    === Word order ===

    The BoW representation of a text removes all word ordering. For example, the BoW representations of "man bites dog" and "dog bites man" are the same, so any algorithm that operates on a BoW representation of text must treat them in the same way. Despite this lack of syntax or grammar, the BoW representation is fast and may be sufficient for simple tasks that do not require word order.
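The definition, the "union" of two bags, and the loss of word order can be sketched in a few lines of Python. This is only an illustration: the two sample sentences are assumptions (the first was chosen so that its counts match the JSON object quoted in the text), and `bag_of_words` is a hypothetical helper name.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count word occurrences, ignoring word order and punctuation."""
    return Counter(re.findall(r"[A-Za-z]+", text))


# Illustrative documents (assumed for this sketch).
doc1 = "John likes to watch movies. Mary likes movies too."
doc2 = "Mary also likes to watch football games."

bow1 = bag_of_words(doc1)
bow2 = bag_of_words(doc2)

# Key order is irrelevant: this is the reordered object quoted in the text.
assert dict(bow1) == {"too": 1, "Mary": 1, "movies": 2, "John": 1,
                      "watch": 1, "likes": 2, "to": 1}

# The "union" of two bags sums the multiplicities of each element,
# which is the same as counting the words of the joined documents.
bow3 = bow1 + bow2
assert bow3 == bag_of_words(doc1 + " " + doc2)

# Word order is lost: both sentences map to the same bag.
assert bag_of_words("man bites dog") == bag_of_words("dog bites man")
```

`Counter` is used here because it behaves like a multiset: adding two counters sums the counts per key, which mirrors the disjoint union described above.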
    For instance, in document classification, if the words "stocks", "trade", and "investors" appear multiple times, then the text is likely a financial report. The same counts, however, cannot distinguish between

    Yesterday, investors were rallying, but today, they are retreating.

    and

    Yesterday, investors were retreating, but today, they are rallying.

    and so the BoW representation would be insufficient to determine the detailed meaning of the document.

    == Implementations ==

    Implementations of the bag-of-words model might involve using frequencies of words in a document to represent its contents. The frequencies can be "normalized" by the inverse of document frequency, or tf–idf. Additionally, for the specific purpose of classification, supervised alternatives have been developed to account for the class label of a document. Lastly, binary (presence/absence or 1/0) weighting is used in place of frequencies for some problems (e.g., this option is implemented in the WEKA machine learning software system).

    == Hashing trick ==

    A common alternative to using dictionaries is the hashing trick, where words are mapped directly to indices with a hash function. When using a hash function, no memory is required to store a dictionary. In practice, hashing simplifies the implementation of bag-of-words models and improves scalability. Collisions can occur when two words are hashed to the same index, but this happens infrequently and may function as a form of regularization.
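The weighting schemes and the hashing trick described above can be sketched as follows. This is a minimal illustration under stated assumptions: the tf–idf variant shown (raw count times log(N / document frequency)) is only one of several in use, CRC32 stands in for whatever hash function a real system would choose, and all function names and the bucket count are hypothetical.

```python
import math
import zlib
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Very simple tokenizer: lower-case and split on whitespace."""
    return text.lower().split()


def tf_idf(docs: list[str]) -> list[dict[str, float]]:
    """Weight each raw term count by log(N / document frequency)."""
    bags = [Counter(tokenize(d)) for d in docs]
    # Document frequency: in how many documents each word occurs.
    df = Counter(word for bag in bags for word in bag)
    n = len(docs)
    return [{w: c * math.log(n / df[w]) for w, c in bag.items()} for bag in bags]


def binary_weights(text: str) -> dict[str, int]:
    """Presence/absence weighting: every word that occurs gets weight 1."""
    return {w: 1 for w in set(tokenize(text))}


def hashed_bow(text: str, n_buckets: int = 16) -> list[int]:
    """Hashing trick: map words straight to vector indices, with no dictionary."""
    vec = [0] * n_buckets
    for w in tokenize(text):
        index = zlib.crc32(w.encode("utf-8")) % n_buckets
        vec[index] += 1  # colliding words simply share a bucket
    return vec


docs = ["stocks trade stocks", "investors trade"]
weights = tf_idf(docs)
hashed = hashed_bow(docs[0])
```

In this sketch a word that occurs in every document ("trade") receives weight zero, while the hashed vector has a fixed length regardless of vocabulary size, which is the memory saving the text refers to.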

  • How to Choose an AI Resume Builder

    How to Choose an AI Resume Builder

    Trying to pick the best AI resume builder? An AI resume builder is software that uses machine learning to help you get more done — it scales effortlessly from a single task to thousands. The best picks balance beginner-friendly simplicity with the depth power users need, and they ship updates often. Whether you are a beginner or a pro, the right AI resume builder slots into your workflow and pays for itself fast. This guide breaks down the top picks, their pros and cons, and who each one is best for.

  • Bruno Zamborlin

    Bruno Zamborlin

    Bruno Zamborlin (born 1983 in Vicenza) is an AI researcher, entrepreneur and artist based in London, working in the field of human-computer interaction. His work focuses on converting physical objects into touch-sensitive, interactive surfaces using vibration sensors and artificial intelligence. In 2013, he founded Mogees Limited, a start-up that transforms everyday objects into musical instruments and games using a vibration sensor and a mobile phone. With HyperSurfaces, he converts physical surfaces of any material, shape and form into data-enabled interactive surfaces using a vibration sensor and a coin-sized chipset. As an artist, he has created art installations around the world; his most recent work, a series of "sound furnitures", was showcased at the Italian Pavilion of the Venice Biennale 2023. He regularly performed with the UK-based electronic music duo Plaid (Warp Records). He is also an honorary visiting research fellow at Goldsmiths, University of London.

    == Early life and education ==

    From 2008 to 2011, Zamborlin worked at IRCAM (Institute for Research and Coordination in Acoustics/Music) – Centre Pompidou as a member of the Sound Music Movement Interaction team. Under the supervision of Frederic Bevilacqua, he started experimenting with the use of artificial intelligence and human movements, and contributed to the creation of Gesture Follower, software that analyses the body movements of performers and dancers through motion sensors in order to control sound and visual media in real time, slowing down or speeding up their playback according to the speed at which the gestures are performed. He has lived in London since 2011, where he pursued a joint PhD in AI between Goldsmiths, University of London and IRCAM – Centre Pompidou/Pierre and Marie Curie University Paris, focussing on the concept of Interactive Machine Learning applied to digital musical instruments and performing arts.
    == Career ==

    Zamborlin founded Mogees Limited in 2013 in London, with IRCAM among the early partners. Mogees transforms physical objects into musical instruments and games using a vibration sensor and a series of apps for smartphones and desktop. After a campaign on Kickstarter in 2014, Mogees was used by both everyday users and artists such as Rodrigo y Gabriela, Jean-Michel Jarre and Plaid. The algorithms implemented in these apps employ a special version of physical modelling sound synthesis, in which the vibrations produced by users interacting with the physical object are used as the exciter for a digital resonator that runs in the app. The result is a hybrid, half-acoustic and half-digital sound which is a function of both the software and the acoustic properties of the physical object the users decide to play.

    In 2017, Zamborlin founded HyperSurfaces together with computational artist Parag K Mital to merge "the physical and the digital worlds". HyperSurfaces technology converts any surface made of any material, shape and size into data-enabled interactive objects, employing a vibration sensor and proprietary AI algorithms running on a coin-sized chipset. The vibrations generated by people's interactions with the surface are converted into an electric signal by a piezoelectric sensor and analysed in real time by AI algorithms that run on the chipset. Whenever the AI recognises in the vibration signal one of the events predefined by the user, a corresponding notification message is generated in real time and sent to an application. The technology can be applied to anything ranging from button-less human-computer interaction applications for automotive and smart home to the Internet of things. Because the AI algorithms employed by HyperSurfaces run locally on a chipset, without the need to access cloud-based services, they are considered to be part of the field of edge computing.
    Also, because the AI can be trained beforehand to recognise the events its users are interested in, HyperSurfaces algorithms belong to the field of supervised machine learning.

    == Selected awards ==

    IRISA Prix Jeune Chercheur, 13 October 2012
    NeMoDe, New Economic Models in the Digital Economy, 25 October 2012

    == Patents and academic publications ==

    United States pending US10817798B2, Bruno Zamborlin & Carmine Emanuele Cella, "Method to recognize a gesture and corresponding device", published 27 April 2016, assigned to Mogees Limited
    GB Pending WO/2019/086862, Bruno Zamborlin; Conor Barry & Alessandro Saccoia et al., "A user interface for vehicles", published 9 May 2019, assigned to Mogees Limited
    GB Pending WO/2019/086863, Bruno Zamborlin; Conor Barry & Alessandro Saccoia et al., "Trigger for game events", published 9 May 2019, assigned to Mogees Limited
    Bevilacqua, Frédéric; Zamborlin, Bruno; Sypniewski, Anthony; Schnell, Norbert; Guédy, Fabrice; Rasamimanana, Nicolas (2010). "Continuous Realtime Gesture Following and Recognition". Gesture in Embodied Communication and Human-Computer Interaction. Lecture Notes in Computer Science. Vol. 5934. pp. 73–84. doi:10.1007/978-3-642-12553-9_7. ISBN 978-3-642-12552-2. S2CID 16251822. Retrieved 17 January 2021.
    Rasamimanana, Nicolas; Bevilacqua, Frédéric; Schnell, Norbert; Guédy, Fabrice; Flety, Emmanuel; Maestracci, Come; Zamborlin, Bruno (January 2010). "Modular musical objects towards embodied control of digital music". Proceedings of the fifth international conference on Tangible, embedded, and embodied interaction. Tei '11. pp. 9–12. doi:10.1145/1935701.1935704. ISBN 9781450304788. S2CID 10782645. Retrieved 17 January 2021.
    Bevilacqua, Frédéric; Schnell, Norbert; Rasamimanana, Nicolas; Zamborlin, Bruno; Guedy, Fabrice (2011). "Online Gesture Analysis and Control of Audio Processing". Musical Robots and Interactive Multimodal Systems. Springer Tracts in Advanced Robotics. Vol. 74. pp. 127–142. doi:10.1007/978-3-642-22291-7_8. ISBN 978-3-642-22290-0. Retrieved 17 January 2021.
    Zamborlin, Bruno; Bevilacqua, Frédéric; Gillies, Marco; D'Inverno, Mark (15 January 2014). "Fluid gesture interaction design: Applications of continuous recognition for the design of modern gestural interfaces". ACM Transactions on Interactive Intelligent Systems. 3 (4): 22:1–22:30. doi:10.1145/2543921. S2CID 7887245. Retrieved 17 January 2021.
    Leslie, Grace; Zamborlin, Bruno; Schnell, Norbert; Jodlowski, Pierre (15 June 2010). "A Collaborative, Interactive Sound Installation". Proceedings of the International Computer Music Conference. Retrieved 17 January 2021.
    Kimura, Mari; Rasamimanana, Nicolas; Bevilacqua, Frédéric; Zamborlin, Bruno; Schnell, Bruno; Flety, Emmanuel (2012). "Extracting Human Expression For Interactive Composition with the Augmented Violin". International Conference on New Interfaces for Musical Expression. Retrieved 17 January 2021.
    Ferretti, Stefano; Roccetti, Marco; Zamborlin, Bruno (13 January 2009). "On SPAWC: Discussion on a Musical Signal Parser and Well-Formed Composer". 2009 6th IEEE Consumer Communications and Networking Conference. pp. 1–5. doi:10.1109/CCNC.2009.4784966. ISBN 978-1-4244-2308-8. S2CID 14213587.
    Zamborlin, Bruno; Partesana, Giorgio; Liuni, Marco (15 May 2011). "(LAND)MOVES". Conference on New Interfaces for Musical Expression, NIME: 537–538. Retrieved 17 January 2021.

  • How to Choose an AI Website Builder

    How to Choose an AI Website Builder

    Shopping for the best AI website builder? An AI website builder is software that uses machine learning to help you get more done — it keeps getting smarter as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI website builder slots into your workflow and pays for itself fast. We tested the leading options and ranked them by quality, value, and ease of use.

  • List of color palettes

    List of color palettes

    The following is a list of color palettes for notable computer graphics, terminals and video game consoles. For each palette, only its name and a simulated image are given. The main article for each palette is linked from its name and contains test charts, sample colours, simulated images, and further technical details (including references).

    During older eras of computing, manufacturers developed many different display systems, often on a competitive, non-collaborative basis (with a few exceptions in the VESA consortium), creating many proprietary, non-standard instances of display hardware. Often, as with early personal and home computers, a given machine employed its own display subsystem with its own color palette. Furthermore, software developers made use of the color abilities of distinct display systems in many different ways. The result is that there is no single common standard nomenclature or classification taxonomy which can encompass every computer color palette.

    In order to organize the material, color palettes have been grouped following certain criteria. First, generic monochrome and full RGB repertories common to various computer display systems are listed. Then come the usual color repertories used for display systems that employ indexed color techniques. Finally, the list covers specific manufacturers' color palettes implemented in many representative early personal computers and video game consoles of various brands.

    The list for personal computer palettes is split into two categories: 8-bit and 16-bit machines. This is not intended as a strict categorization of such machines, because mixed architectures also exist (16-bit processors with an 8-bit data bus or 32-bit processors with a 16-bit data bus, among others). The distinction is based more on broad 8-bit and 16-bit computer ages or generations (around 1975–1985 and 1985–1995, respectively) and their associated state of the art in color display capabilities.
    A common color test chart and sample image are used to render each palette in this list; see further details in the summary paragraph of the corresponding article.

    == List of monochrome and RGB palettes ==

    In this article, the term monochrome palette means a set of intensities for a monochrome display, and the term RGB palette is defined as the complete set of combinations a given RGB display can offer by mixing all the possible intensities of the red, green, and blue primaries available in its hardware. These are generic complete repertories of colors to produce black and white and RGB color pictures by the display hardware, not necessarily the total number of such colors that can be simultaneously displayed in a given text or graphic mode of any machine. RGB is the most common method to produce colors for displays, so these complete RGB color repertories have every possible combination of R-G-B triplets within any given maximum number of levels per component. For specific hardware and for methods of producing colors other than RGB, see the List of computer hardware palettes and the List of video game consoles sections. For various software arrangements and sorts of colors, including other possible full RGB arrangements within 8-bit depth displays, see the List of software palettes section.

    === Monochrome palettes ===

    These palettes only have shades of gray.

    === Dichrome palettes ===

    Each permuted pair of red, green, and blue (16-bit color palette, with 65,536 colors). For example, "additive red green" has zero blue and "subtractive red green" has full blue.

    === Regular RGB palettes ===

    These full RGB palettes employ the same number of bits to store the relative intensity for the red, green and blue components of every image's pixel color. Thus, they have the same number of levels per channel and the total number of possible colors is always the cube of a power of two.
    When they were developed, many of these formats were directly related to the natural word length of the host computer, that is, the number of bits held at a single memory address that the CPU can read or write in one operation.

    === Non-regular RGB palettes ===

    These are also RGB palettes, in the sense defined above (except for 4-bit RGBI, which has an intensity bit that affects all channels at once), but either they do not have the same number of levels for each primary channel, or the numbers of levels are not powers of two and so are not represented as separate bit fields. All of these have been used in popular personal computers.

    == List of software palettes ==

    Systems that use a 4-bit or 8-bit pixel depth can display up to 16 or 256 colors simultaneously. Many personal computers in the later 1980s and early 1990s displayed at most 256 different colors, freely selected by software (either by the user or by a program) from their wider hardware color palette. Usual selections of colors in limited subsets (generally 16 or 256) of the full palette include some RGB level arrangements commonly used with the 8 bpp palettes as master palettes or universal palettes (i.e., palettes for multipurpose uses). These are some representative software palettes, but any selection can be made in such types of systems.

    === System specific palettes ===

    These are selections of colors officially employed as system palettes in some popular operating systems for personal computers that feature 8-bit displays.

    === RGB arrangements ===

    These are selections of colors based on evenly ordered RGB levels, mainly used as master palettes to display any kind of image within the limitations of the 8-bit pixel depth.
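The arithmetic behind regular RGB palettes and pixel depths can be illustrated with a short sketch. The function names are hypothetical, and scaling each level to the 0–255 range is an assumption made here for readability, not something a particular piece of hardware is claimed to do.

```python
from itertools import product


def regular_rgb_palette(bits_per_channel: int) -> list[tuple[int, int, int]]:
    """Every R-G-B triplet for a palette with the same bit count per channel.

    Assumes bits_per_channel >= 1.
    """
    levels = 2 ** bits_per_channel
    # Spread the levels evenly over 0..255 (an illustrative choice).
    scale = [round(i * 255 / (levels - 1)) for i in range(levels)]
    return list(product(scale, repeat=3))


def simultaneous_colors(pixel_depth_bits: int) -> int:
    """Number of distinct color indices a pixel of the given depth can hold."""
    return 2 ** pixel_depth_bits


palette_3bit = regular_rgb_palette(1)   # 1 bit per channel, 3 bits in total
palette_6bit = regular_rgb_palette(2)   # 2 bits per channel, 6 bits in total
```

With n bits per channel there are 2^n levels per channel and (2^n)^3 colors in total, the cube of a power of two; a 4-bit or 8-bit pixel depth allows 16 or 256 of them on screen at once.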
    === Other common uses of software palettes ===

    == List of computer hardware palettes ==

    In old personal computers and terminals that offered color displays, some color palettes were chosen algorithmically to provide the most diverse set of colors for a given palette size, and others were chosen to assure the availability of certain colors. In many early home computers, especially when the palette choices were determined at the hardware level by resistor combinations, the palette was determined by the manufacturer.

    Many early models output composite video colors. When seen on TV devices, the perceived colors may not correspond to the color values employed (this is most noticeable with the NTSC TV color system).

    For current RGB display systems for PCs (Super VGA, etc.), see 16-bit RGB and 24-bit RGB for the High Color (thousands of colors) and True Color (millions of colors) modes. For video game consoles, see the List of video game consoles section.

    For every model, the main graphical color modes are listed based exclusively on the way they handle colors on screen; not all of their screen modes are listed. The list is organized roughly historically by video hardware, not by branch. Systems are listed according to the original model of each system, which means that extended versions, clones, and compatibles also support the original palette.

    === Terminals and 8-bit machines ===

    === 16-bit machines ===

    === Video game console palettes ===

    Color palettes of some of the most popular video game consoles. The criteria are the same as those of the List of computer hardware palettes section.

  • Optical braille recognition

    Optical braille recognition

    Optical braille recognition is technology that captures images of braille characters and converts them into natural language characters. It is used to convert braille documents into text for people who cannot read braille, and to preserve and reproduce the documents.

    == History ==

    In 1984, a group of researchers at the Delft University of Technology designed a braille reading tablet, in which a reading head with photosensitive cells was moved along a set of rulers to capture braille text line by line. In 1988, a group of French researchers at the Lille University of Science and Technology developed an algorithm, called Lectobraille, which converted braille documents into plain text. The system photographed the braille text with a low-resolution CCD camera, and used spatial filtering techniques, median filtering, erosion, and dilation to extract the braille. The braille characters were then converted to natural language using adaptive recognition. The Lectobraille technique had an error rate of 1% and took an average processing time of seven seconds per line. In 1993, a group of researchers from the Katholieke Universiteit Leuven developed a system to recognize braille that had been scanned with a commercially available scanner. The system, however, was unable to handle deformities in the braille grid, so well-formed braille documents were required. In 1999, a group at the Hong Kong Polytechnic University implemented an optical braille recognition technique using edge detection to translate braille into English or Chinese text. In 2001, Murray and Dais created a handheld recognition system that scanned small sections of a document at a time. Because only a small area was scanned at once, grid deformation was less of an issue, and a simpler, more efficient algorithm was employed. In 2003, Morgavi and Morando designed a system to recognize braille characters using artificial neural networks.
    This system was noted for its ability to handle image degradation more successfully than other approaches.

    == Challenges ==

    Many of the challenges to successfully processing braille text arise from the nature of braille documents. Braille is generally printed on solid-color paper, with no ink to produce contrast between the raised characters and the background paper. At the same time, imperfections in the page can appear in a scan or image of the page. Many documents are printed inter-point, meaning they are double-sided. As such, the depressions of the braille on one side appear interlaid with the protruding braille of the other side.

    == Techniques ==

    Some optical braille recognition techniques attempt to use oblique lighting and a camera to reveal the shadows of the depressions and protrusions of the braille. Others make use of commercially available document scanners.
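The filtering steps named in the History section (median filtering, erosion, and dilation) are generic image-processing operations. The sketch below shows them on a tiny binary image with a 3x3 window; it is not the Lectobraille algorithm itself, and the image, window size, and function names are assumptions made for illustration.

```python
Grid = list[list[int]]


def _window(img: Grid, r: int, c: int) -> list[int]:
    """Values in the 3x3 window around (r, c); pixels outside the image count as 0."""
    rows, cols = len(img), len(img[0])
    return [
        img[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
        for rr in range(r - 1, r + 2)
        for cc in range(c - 1, c + 2)
    ]


def _apply(img: Grid, reducer) -> Grid:
    """Apply a reducer to the 3x3 window around every pixel."""
    return [[reducer(_window(img, r, c)) for c in range(len(img[0]))]
            for r in range(len(img))]


def erode(img: Grid) -> Grid:
    """A pixel stays 1 only if its whole 3x3 window is 1."""
    return _apply(img, min)


def dilate(img: Grid) -> Grid:
    """A pixel becomes 1 if any pixel in its 3x3 window is 1."""
    return _apply(img, max)


def median_filter(img: Grid) -> Grid:
    """Replace each pixel by the median of its 3x3 window."""
    return _apply(img, lambda w: sorted(w)[4])


# A 3x3 "dot" plus one isolated noise pixel at row 5, column 5.
image = [[0] * 7 for _ in range(7)]
for r in range(1, 4):
    for c in range(1, 4):
        image[r][c] = 1
image[5][5] = 1

# Erosion followed by dilation (an "opening") removes the noise pixel
# and restores the dot to its original size.
opened = dilate(erode(image))
```

Erosion followed by dilation is a common way to suppress specks smaller than the window while keeping larger blobs, which is one plausible reason such steps appear in a dot-extraction pipeline.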

  • Julia Hirschberg

    Julia Hirschberg

    Julia Hirschberg is an American computer scientist noted for her research on computational linguistics and natural language processing. She received her first PhD, in history, from the University of Michigan and her second, in computer science, from the University of Pennsylvania, where she did research in natural language processing. She worked at Bell Labs and AT&T Bell Labs from 1985 to 2002 and has been at Columbia University since 2002, where she is currently the Percy K. and Vida L. W. Hudson Professor of Computer Science.

    == Biography ==

    Julia Linn Bell Hirschberg received her first Ph.D. degree in history (16th-century Mexico) from the University of Michigan in 1976. She served on the History faculty of Smith College from 1974 to 1982. She subsequently shifted to Computer Science studies, receiving her M.S. in Computer and Information Science from the University of Pennsylvania in 1982 and a Ph.D. in Computer and Information Science from the same university in 1985.

    Upon graduation from the University of Pennsylvania in 1985, Hirschberg joined AT&T Bell Labs as a Member of Technical Staff in the Linguistics Research Department, where she worked on improving prosody assignment for Text-to-Speech Synthesis (TTS) in the Bell Labs TTS system. She was promoted to Department Head in 1994, when she created a new Human Computer Interface Research Lab. She and her department remained at Bell Labs until 1996, when they moved to AT&T Labs Research as part of a corporate reorganization.

    In 2002, she joined the Columbia University faculty as a professor in the Department of Computer Science. She served as Chair of the Computer Science Department from 2012 to 2018. She still leads classes at Columbia in speech and natural language research and supervises PhD students and a large number of research project students.
    == Research ==

    Hirschberg's research has included prosody, discourse structure, conversational implicature, text-to-speech synthesis, speech summarization, spoken dialogue systems, emotional speech, deceptive speech, charismatic speech, entrainment, empathetic speech and code-switching.

    Hirschberg was among the first to combine Natural Language Processing (NLP) approaches to discourse and dialogue with speech research. She pioneered techniques in text analysis for prosody assignment in Text-to-Speech synthesis at Bell Laboratories in the 1980s and 1990s, developing corpus-based statistical models based upon syntactic and discourse information which are in general use today in TTS systems. With Janet Pierrehumbert, she developed a theoretical model of intonational meaning. She was a leader in the development of the ToBI conventions for intonational description, which have been extended to numerous languages and which today are the most widely used standard for intonational annotation.

    Together with Gregory Ward, Hirschberg has been a pioneer in much experimental work on intonational sources of language meaning and how these interact with pragmatic phenomena, particularly on the meaning of accented (intonationally prominent) items and the meaning of intonational contours. She has also innovated in numerous other areas involving prosody and meaning, including the role of grammatical function and surface position in pitch accent location, the use of prosody in disambiguating cue phrases (discourse markers) with Diane Litman, the role of prosody in disambiguation in English, Italian, and Spanish with Cinzia Avesani and Pilar Prieto, and the automatic identification of speech recognition errors using prosodic information. At AT&T Labs she worked with Fernando Pereira, Steve Whittaker, and others on speech search and on developing new interfaces for speech navigation.
    At Columbia, she and her students have continued and extended research on spoken dialogue systems (automatically detecting speech recognition errors and inappropriate system queries, modeling turn-taking behavior, dialogue entrainment, modeling and generating clarification dialogues); on the automatic classification of trust, charisma, deception and emotion from speech; on speech summarization; and on prosody translation, hedging behavior in text and speech, text-to-speech synthesis, and speech search in low resource languages. She also holds several patents in TTS and in speech search. Corpora she and collaborators have collected include the Boston Directions Corpus, the Columbia SRI Colorado Deception Corpus, and the Columbia Games Corpus.

    She has served on numerous technical boards and editorial committees. She has served as a member of the Computing Research Association's (CRA) Board of Directors and as co-chair of CRA-W. She is also noted for her leadership in broadening participation in computing.

    == Awards ==

    Hirschberg's notable honors and awards include:

    Elected as a member of the National Academy of Artificial Intelligence Academy of Sciences and recipient of the NAAI Artificial Intelligence Exploration Award, 2025.
    Elected as a Fellow of the Asia-Pacific Artificial Intelligence Association (AAIA), 2024.
    ISCA Special Service Medal, 2020.
    Honorary Doctorate (eredoctoraat) from Tilburg University, Netherlands, 2018.
    American Academy of Arts and Sciences, 2018.
    IEEE Fellow, 2017.
    National Academy of Engineering, 2017.
    ACM Fellow, 2015.
    Elected member, American Philosophical Society, 2014.
    Honorary member, Association for Laboratory Phonology, 2014.
    Association for Computational Linguistics (ACL) (Founding) Fellow, 2011.
    International Speech Communication Association (ISCA) Medal for Scientific Achievement, 2011.
    IEEE James L. Flanagan Speech and Audio Processing Award, 2011.
    Honorary Doctorate (Hedersdoktorer), KTH (Royal Institute of Technology) Stockholm, Sweden, 2007.
    AAAI Fellow, 1994.

    == Publications ==

    A social history of Puebla de Los Ángeles, 1531-60, 1976
    Empirical studies on the disambiguation of cue phrases, 1991
    Prosody and conversation, 1998
    Most recent publications and other information: https://www.cs.columbia.edu/speech/

  • Michael L. Littman

    Michael L. Littman

    Michael Lederman Littman (born August 30, 1966) is a computer scientist, researcher, educator, and author. His research interests focus on reinforcement learning. He is currently a University Professor of Computer Science at Brown University, where he has taught since 2012. As of July 2025, he is also the university’s inaugural Associate Provost for Artificial Intelligence.

    == Career ==

    Before graduate school, Littman worked with Thomas Landauer at Bellcore and was granted a patent for one of the earliest systems for cross-language information retrieval. Littman received his Ph.D. in computer science from Brown University in 1996. From 1996 to 1999, he was a professor at Duke University. During his time at Duke, he worked on an automated crossword solver, PROVERB, which won an Outstanding Paper Award in 1999 from AAAI and competed in the American Crossword Puzzle Tournament. From 2000 to 2002, he worked at AT&T. From 2002 to 2012, he was a professor at Rutgers University; he chaired the department from 2009 to 2012. In summer 2012 he returned to Brown University as a full professor. He has also taught at Georgia Institute of Technology, where he was listed as an adjunct professor.

    Littman served as the Division Director for Information and Intelligent Systems (the AI division) at the National Science Foundation from 2022 to 2025. After serving his term, he returned to Brown University as its first Associate Provost for Artificial Intelligence, where he coordinates the intersection of AI with research, teaching, operations, policy, and communication at the university level.

    == Research ==

    Littman's research interests are varied but have focused mostly on reinforcement learning and related fields, including machine learning more generally, game theory, computer networking, partially observable Markov decision process solving, computer solving of analogy problems, and other areas.
    He is also interested in computing education more broadly and has authored a book on programming for everyone.

    == Leadership and Service ==

    Littman chaired the panel for The One Hundred‑Year Study on Artificial Intelligence (AI100) 2021 Report and will chair the standing committee for the 2026 report. During his time at the National Science Foundation, he co-led the development of the 2023 update of the National Artificial Intelligence Research and Development Strategic Plan.

    == Personal Notes ==

    Littman is also known for his playful approach to communication. He has produced multiple educational and parody videos (for example, a machine-learning version of Michael Jackson’s Thriller with his frequent collaborator Charles Lee Isbell, Jr.) as part of his teaching outreach. Among his hobbies, he has been noted for riding an electric unicycle to his office at the NSF.

    == Awards ==

    Elected as an ACM Fellow in 2018 for "contributions to the design and analysis of sequential decision-making algorithms in artificial intelligence".
    Winner of the IFAAMAS Influential Paper Award (2014)
    Winner of the AAAI “Shakey” Award for Overfitting: Machine Learning Music Video (2014)
    Elected as an AAAI Fellow in 2010 for "significant contributions to the fields of reinforcement learning, decision making under uncertainty, and statistical language applications".
    Winner of the AAAI “Shakey” Award for Short Video for Aibo Ingenuity (2007)
    Winner of the Warren I. Susman Award for Excellence in Teaching at Rutgers (2011)
    Winner of the Robert B. Cox Award at Duke (1999)
    Winner of the AAAI Outstanding Paper Award (1999)

  • Distributed concurrency control

    Distributed concurrency control

    Distributed concurrency control is the concurrency control of a system distributed over a computer network (Bernstein et al. 1987, Weikum and Vossen 2001). In database systems and transaction processing (transaction management), distributed concurrency control refers primarily to the concurrency control of a distributed database. It also refers to concurrency control in a multidatabase (and other multi-transactional object) environment (e.g., federated database, grid computing, and cloud computing environments). A major goal of distributed concurrency control is distributed serializability (or global serializability for multidatabase systems). Distributed concurrency control poses special challenges beyond those of centralized concurrency control, primarily due to communication and computer latency. It often requires special techniques, such as a distributed lock manager running over fast, low-latency computer networks like switched fabric (e.g., InfiniBand). The most common distributed concurrency control technique is strong strict two-phase locking (SS2PL, also named rigorousness), which is also a common centralized concurrency control technique. SS2PL provides both serializability and strictness. Strictness, a special case of recoverability, is utilized for effective recovery from failure. For large-scale distribution and complex transactions, distributed locking's typical heavy performance penalty (due to delays and latency) can be avoided by using the atomic commitment protocol, which is needed in a distributed database for the atomicity of (distributed) transactions.
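
    The locking rule behind SS2PL can be illustrated with a small sketch. This is a toy, single-process lock table, not a distributed lock manager: the class and method names are hypothetical, conflicting requests are simply refused instead of queued, and deadlock handling and network messaging are left out. What it does show is the defining property that a transaction releases all of its locks, read and write alike, only when it ends.

```python
from collections import defaultdict


class SS2PLLockTable:
    """Toy lock table illustrating strong strict 2PL (hypothetical API)."""

    def __init__(self):
        # item -> set of transaction ids holding a shared (read) lock
        self.readers = defaultdict(set)
        # item -> transaction id holding the exclusive (write) lock
        self.writer = {}

    def acquire_read(self, txn, item):
        holder = self.writer.get(item)
        if holder is not None and holder != txn:
            return False  # another transaction holds the write lock
        self.readers[item].add(txn)
        return True

    def acquire_write(self, txn, item):
        holder = self.writer.get(item)
        if holder is not None and holder != txn:
            return False  # another transaction holds the write lock
        if self.readers[item] - {txn}:
            return False  # other transactions hold read locks
        self.writer[item] = txn
        return True

    def end_transaction(self, txn):
        """Commit or abort: the only point at which locks are released."""
        for item in list(self.writer):
            if self.writer[item] == txn:
                del self.writer[item]
        for holders in self.readers.values():
            holders.discard(txn)


table = SS2PLLockTable()
table.acquire_write("T1", "x")   # T1 now holds the write lock on x
blocked = table.acquire_read("T2", "x")   # refused while T1 is active
table.end_transaction("T1")      # T1 ends, all of its locks are released
granted = table.acquire_read("T2", "x")   # now allowed
```

    Because no lock is given up before the transaction ends, other transactions never read or overwrite data written by a transaction that has not yet ended, which is the strictness property described above.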

    Read more →
  • Top 10 AI Code-review Tools Compared (2026)

    Top 10 AI Code-review Tools Compared (2026)

    In search of the best AI code-review tool? An AI code-review tool is software that uses machine learning to review code changes and suggest fixes, typically within seconds. When choosing one, weigh output quality, pricing, export formats, and how well it fits the tools you already use. Whether you are a beginner or a pro, the right AI code-review tool should slot into your workflow and can pay for itself quickly. Below we compare features, pricing, and real output so you can choose with confidence.

    Read more →
  • AI Logo Makers Reviews: What Actually Works in 2026

    AI Logo Makers Reviews: What Actually Works in 2026

    Shopping for the best AI logo maker? An AI logo maker is software that uses machine learning to generate logo designs, and its output tends to improve as the underlying models improve. Pricing, accuracy, and the size of the model behind the tool are the three factors that most affect daily usefulness. Whether you are a beginner or a pro, the right AI logo maker should slot into your workflow and can pay for itself quickly. We tested the leading options and ranked them by quality, value, and ease of use.

    Read more →
  • Alberto Broggi

    Alberto Broggi

    Alberto Broggi is General Manager at VisLab srl (a spinoff of the University of Parma acquired by the Silicon Valley company Ambarella Inc. in June 2015) and a professor of Computer Engineering at the University of Parma in Italy. == Research in computer vision, hardware, and AV == Broggi's research activities started in 1991–1994. His group, together with the Dipartimento di Elettronica, Politecnico di Torino, Italy, built its own hardware architecture (named PAPRICA, for PArallel PRocessor for Image Checking and Analysis, based on 256 single-bit processing elements working in SIMD fashion) and installed it on board a mobile laboratory (Mob-Lab) to develop and test some initial concepts in the field of intelligent vehicles. In 1996, Broggi's group worked to develop a real vehicle prototype (named ARGO, a Lancia Thema passenger car equipped with vision sensors, processing systems, and vehicle actuators) and developed the software and hardware that enabled it to drive autonomously on standard roads. Broggi's research group (called VisLab from then on) gathered all their findings in a book, which was later translated into Chinese. When Broggi was with the University of Pavia, his research was extended and applied to extreme conditions (automatic driving on snow and ice): in 2001, VisLab led the research effort to provide a vehicle (RAS, Robot Antartico di Superficie) with sensing capabilities so that it could automatically follow the vehicle in front. In 2010, Broggi's group embarked on driving four vehicles autonomously from Italy to China with no human intervention. This challenge is called VIAC, for VisLab Intercontinental Autonomous Challenge. Soon after this, Broggi was awarded a second ERC grant (Proof of Concept) to industrialize some of the results obtained and successfully tested on the VIAC vehicles. 
On July 12, 2013, VisLab tested the BRAiVE vehicle in downtown Parma, negotiating two-way narrow rural roads, pedestrian crossings, traffic lights, artificial bumps, pedestrian areas, and tight roundabouts. The vehicle traveled from the Parma University Campus to Piazza della Pilotta (downtown Parma): a 20-minute run in a real environment, in real traffic at 11 a.m. on a working day, that required no human intervention. Part of this test was driven with nobody in the driver's seat, for the first time ever on public roads.

    Read more →
  • Pose (computer vision)

    Pose (computer vision)

    In the fields of computing and computer vision, pose (or spatial pose) represents the position and the orientation of an object, each usually in three dimensions. Poses are often stored internally as transformation matrices. The term “pose” is largely synonymous with the term “transform”, but a transform may often include scale, whereas pose does not. In computer vision, the pose of an object is often estimated from camera input by the process of pose estimation. This information can then be used, for example, to allow a robot to manipulate an object or to avoid moving into the object based on its perceived position and orientation in the environment. Other applications include skeletal action recognition. == Pose estimation == The specific task of determining the pose of an object in an image (or in stereo images or an image sequence) is referred to as pose estimation. Pose estimation problems can be solved in different ways depending on the image sensor configuration and the choice of methodology. Three classes of methodologies can be distinguished: Analytic or geometric methods: These assume that the image sensor (camera) is calibrated and that the mapping from 3D points in the scene to 2D points in the image is known. If the geometry of the object is also known, the projected image of the object on the camera image is a well-known function of the object's pose. Once a set of control points on the object, typically corners or other feature points, has been identified, it is then possible to solve for the pose transformation from a set of equations which relate the 3D coordinates of the points to their 2D image coordinates. Algorithms that determine the pose of a point cloud with respect to another point cloud are known as point set registration algorithms if the correspondences between points are not already known. Genetic algorithm methods: If the pose of an object does not have to be computed in real time, a genetic algorithm may be used. 
This approach is robust, especially when the images are not perfectly calibrated. In this particular case, the pose serves as the genetic representation, and the error between the projection of the object control points and the image is the fitness function. Learning-based methods: These methods use an artificial learning-based system which learns the mapping from 2D image features to the pose transformation. In short, this means that a sufficiently large set of images of the object, in different poses, must be presented to the system during a learning phase. Once the learning phase is completed, the system should be able to present an estimate of the object's pose given an image of the object. == Camera pose ==

    Read more →
  • Frank Hutter

    Frank Hutter

    Frank Hutter is a German computer scientist recognized for his contributions to machine learning, particularly in the areas of automated machine learning (AutoML), hyperparameter optimization, meta-learning and tabular machine learning. He is currently a Hector-Endowed Fellow and PI at the ELLIS Institute Tübingen and a Full Professor (W3) for Machine Learning at the Department of Computer Science, University of Freiburg. Hutter is known for his role in establishing AutoML as a key area in artificial intelligence research. == Education and academic career == Frank Hutter received his academic training in computer science at Darmstadt University of Technology, where he completed his Vordiplom (comparable to a BSc) and Hauptdiplom (equivalent to MSc) by 2004. He later pursued his PhD at the University of British Columbia, under the supervision of Profs. Holger Hoos, Kevin Leyton-Brown and Kevin Murphy, where his doctoral thesis, titled "Automated Configuration of Algorithms for Solving Hard Computational Problems," was awarded the CAIAC Doctoral Dissertation Award for the best thesis in Artificial Intelligence completed at a Canadian university in 2009. Hutter did his postdoctoral research at the University of British Columbia, where he worked from 2009 to 2013. In 2013, he moved to the University of Freiburg, initially leading an Emmy Noether Research Group, and in 2017, he was appointed as a Full Professor. His contributions to machine learning have been recognized globally, particularly his work in AutoML and hyperparameter optimization. Overall, Hutter has authored over 180 peer-reviewed publications, which have garnered more than 89,000 citations, reflecting the high impact of his work. == Contributions in AutoML == Hutter's early research laid the groundwork for the field of Automated Machine Learning (AutoML). He has been a key figure in establishing AutoML as a distinct research area. 
Along with various colleagues, he organized the AutoML workshops from 2014 to 2021, wrote the first book on AutoML and taught the first MOOC on AutoML. He also co-founded the AutoML conference in 2022 and served as its general chair for the first two years. He has also published prominent works in various subfields of AutoML, such as hyperparameter optimization, neural architecture search, meta-learning and AutoML systems. He is currently the most highly cited researcher in AutoML. == Contributions in machine learning for tabular data == Hutter has also made many contributions to machine learning for tabular data. He led the development of the first widely adopted AutoML system for tabular data, AutoWEKA, which was published at KDD 2013 and received the KDD Test of Time Award in 2023. Subsequently, he led the development of Auto-sklearn, the first highly used AutoML system for tabular data in Python, and with it won the first and second international AutoML challenges, both of which only included tabular data. More recently, he has focused on tabular foundation models, including TabPFN, which was published in the journal Nature. In 2024, he also co-founded Prior Labs, the first company focusing on tabular foundation models. == Awards and honors == Hutter has received numerous awards throughout his career. In 2023, he won the KDD Test of Time Award for Research together with Chris Thornton, Holger H. Hoos, and Kevin Leyton-Brown. He has received three grants from the ERC: an ERC Starting Grant (2016), an ERC Proof of Concept Grant (2020), and an ERC Consolidator Grant (2022). In 2021, he became an ELLIS Unit Director and was also recognized as a EurAI Fellow, in addition to receiving the AIJ Prominent Paper Award. Earlier, he was a recipient of the Google Faculty Research Award in 2018. 
His groundbreaking research was acknowledged early in his career with the IJCAI Distinguished Paper Award in 2013 and the IJCAI/JAIR Best Paper Prize in 2010. == Representative publications == Hutter, F., Kotthoff, L. and Vanschoren, J., editors. Automated machine learning: methods, systems, challenges. Springer Nature, 2019. www.automl.org/book. Feurer, M., Klein, A., Eggensperger, K., Springenberg, T., Blum, M. and Hutter, F. Efficient and Robust Automated Machine Learning. In NeurIPS 2015. Loshchilov, I. and Hutter, F. Decoupled weight decay regularization. In ICLR 2018. Zela, A., Elsken, T., Saikia, T., Marrakchi, Y., Brox, T. and Hutter, F. Understanding and Robustifying Differentiable Architecture Search. In ICLR 2020. Hollmann, N., Müller, S., Eggensperger, K. and Hutter, F. TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second. In ICLR 2023.

    Read more →
  • IBM alignment models

    IBM alignment models

    The IBM alignment models are a sequence of increasingly complex models used in statistical machine translation to train a translation model and an alignment model, starting with lexical translation probabilities and moving to reordering and word duplication. They underpinned the majority of statistical machine translation systems for almost twenty years starting in the early 1990s, until neural machine translation began to dominate. These models offer a principled probabilistic formulation and (mostly) tractable inference. The IBM alignment models were published in parts in 1988 and 1990, and the entire series was published in 1993. Every author of the 1993 paper subsequently went to the hedge fund Renaissance Technologies. The original work on statistical machine translation at IBM proposed five models, and a Model 6 was proposed later. The sequence of the six models can be summarized as: Model 1: lexical translation. Model 2: additional absolute alignment model. Model 3: extra fertility model. Model 4: added relative alignment model. Model 5: fixed deficiency problem. Model 6: Model 4 combined with an HMM alignment model in a log-linear way. == Mathematical setup == The IBM alignment models treat translation as a conditional probability model. For each source-language ("foreign") sentence $f$, we generate both a target-language ("English") sentence $e$ and an alignment $a$. The problem then is to find a good statistical model for $p(e, a \mid f)$, the probability that we would generate the English sentence $e$ and an alignment $a$ given a foreign sentence $f$. The meaning of an alignment grows increasingly complicated as the model version number increases. See Model 1 for the simplest and most understandable version. 
== Model 1 == === Word alignment === Given any foreign-English sentence pair $(e, f)$, an alignment for the sentence pair is a function of type $\{1, \dots, l_e\} \to \{0, 1, \dots, l_f\}$. That is, we assume that the English word at location $i$ is "explained" by the foreign word at location $a(i)$. For example, consider the following pair of sentences: "It will surely rain tomorrow" and "明日 は きっと 雨 だ". We can align some English words to corresponding Japanese words, but not all of them: it -> ?; will -> ?; surely -> きっと; rain -> 雨; tomorrow -> 明日. This in general happens due to the different grammar and conventions of speech in different languages. English sentences require a subject, and when there is no subject available, English uses the dummy pronoun it. Japanese verbs do not have different forms for future and present tense, and the future tense is implied by the noun 明日 (tomorrow). Conversely, the topic marker は and the grammar word だ (roughly "to be") do not correspond to any word in the English sentence. So, we can write the alignment as 1 -> 0; 2 -> 0; 3 -> 3; 4 -> 4; 5 -> 1, where 0 means that there is no corresponding alignment. Thus, we see that the alignment function is in general a function of type $\{1, \dots, l_e\} \to \{0, 1, \dots, l_f\}$. Later models allow one English word to be aligned with multiple foreign words. === Statistical model === Given the above definition of alignment, we can define the statistical model used by Model 1: Start with a "dictionary". Its entries are of the form $t(e_i \mid f_j)$, which can be interpreted as saying "the foreign word $f_j$ is translated to the English word $e_i$ with probability $t(e_i \mid f_j)$". 
After being given a foreign sentence $f$ with length $l_f$, we first generate an English sentence length $l_e$ from $\mathrm{Uniform}[1, 2, \dots, N]$. In particular, it does not depend on $f$ or $l_f$. Then, we generate an alignment uniformly from the set of all possible alignment functions $\{1, \dots, l_e\} \to \{0, 1, \dots, l_f\}$. Finally, we generate each English word $e_1, e_2, \dots, e_{l_e}$ independently of every other English word. The word $e_i$ is generated according to $t(e_i \mid f_{a(i)})$. Together, we have the probability $$p(e, a \mid f) = \frac{1/N}{(1 + l_f)^{l_e}} \prod_{i=1}^{l_e} t(e_i \mid f_{a(i)})$$ IBM Model 1 uses very simplistic assumptions on the statistical model, in order to allow the following algorithm to have a closed-form solution. === Learning from a corpus === If a dictionary is not provided at the start, but we have a corpus of English-foreign language pairs $\{(e^{(k)}, f^{(k)})\}_k$ (without alignment information), then the model can be cast into the following form: fixed parameters: the foreign sentences $\{f^{(k)}\}_k$; learnable parameters: the entries of the dictionary $t(e_i \mid f_j)$; observable variables: the English sentences $\{e^{(k)}\}_k$; latent variables: the alignments $\{a^{(k)}\}_k$. In this form, this is exactly the kind of problem solved by the expectation–maximization algorithm. 
Due to the simplistic assumptions, the algorithm has a closed-form, efficiently computable solution, which is the solution to the following equations: $$\begin{cases}\max_{t'} \sum_k \sum_i \sum_{a^{(k)}} t(a^{(k)} \mid e^{(k)}, f^{(k)}) \ln t'(e_i^{(k)} \mid f_{a^{(k)}(i)}^{(k)}) \\ \sum_x t'(e_x \mid f_y) = 1 \quad \forall y\end{cases}$$ This can be solved by Lagrange multipliers, then simplified. For a detailed derivation of the algorithm, see chapter 4 and. In short, the EM algorithm goes as follows: INPUT. A corpus of English-foreign sentence pairs $\{(e^{(k)}, f^{(k)})\}_k$. INITIALIZE. A matrix of translation probabilities $t(e_x \mid f_y)$. This could either be uniform or random. It is only required that every entry is positive and that, for each $y$, the probabilities sum to one: $\sum_x t(e_x \mid f_y) = 1$. LOOP. Until $t(e_x \mid f_y)$ converges: $$t(e_x \mid f_y) \leftarrow \frac{t(e_x \mid f_y)}{\lambda_y} \sum_{k,i,j} \frac{\delta(e_x, e_i^{(k)}) \, \delta(f_y, f_j^{(k)})}{\sum_{j'} t(e_i^{(k)} \mid f_{j'}^{(k)})}$$ where each $\lambda_y$ is a normalization constant that makes sure each $\sum_x t(e_x \mid f_y) = 1$. RETURN. $t(e_x \mid f_y)$. In the above formula, $\delta$ is the Kronecker delta: it equals 1 if the two entries are equal, and 0 otherwise. 
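
The INPUT / INITIALIZE / LOOP / RETURN steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name `train_ibm_model1`, the `<NULL>` token standing in for foreign position 0, the fixed iteration count used in place of a convergence test, and the three-sentence toy corpus are all choices made here for the example.

```python
from collections import defaultdict

NULL = "<NULL>"  # hypothetical token for foreign position 0 ("no alignment")


def train_ibm_model1(corpus, iterations=10):
    """Estimate t(e | f) with EM.

    corpus: list of (english_words, foreign_words) pairs, each a list of str.
    Returns a dict-like mapping (e, f) -> probability.
    """
    english_vocab = {e for english, _ in corpus for e in english}
    uniform = 1.0 / len(english_vocab)
    # INITIALIZE: uniform, positive, and summing to one over e for each f.
    t = defaultdict(lambda: uniform)

    for _ in range(iterations):
        count = defaultdict(float)  # un-normalized update, per (e, f)
        total = defaultdict(float)  # normalizer lambda_y, per foreign word f
        for english, foreign in corpus:
            candidates = [NULL] + list(foreign)
            for e in english:
                # Denominator: sum over j' of t(e_i | f_j') in this sentence.
                z = sum(t[(e, f)] for f in candidates)
                for f in candidates:
                    share = t[(e, f)] / z
                    count[(e, f)] += share
                    total[f] += share
        # Normalize so that the probabilities for each f sum to one.
        t = defaultdict(
            float, {(e, f): c / total[f] for (e, f), c in count.items()}
        )
    return t


corpus = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
t = train_ibm_model1(corpus)
```

On this toy corpus, `t[("the", "das")]` ends up larger than `t[("house", "das")]`, because "das" co-occurs with "the" in two sentence pairs but with "house" in only one. A real implementation would stop when the table stops changing, as the LOOP step specifies, not after a fixed number of passes.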
The index notation is as follows: $k$ ranges over English-foreign sentence pairs in the corpus; $i$ ranges over words in English sentences; $j$ ranges over words in foreign language sentences; $x$ ranges over the entire vocabulary of English words in the corpus; $y$ ranges over the entire vocabulary of foreign words in the corpus. === Limitations === There are several limitations to IBM Model 1. No fluency: Given any sentence pair $(e, f)$, any permutation of the English sentence is equally likely: $p(e \mid f) = p(e' \mid f)$ for any permutation of the English sentence $e$ into $e'$. No length preference: The probability of each length of translation is equal: $\sum_{e \text{ has length } l} p(e \mid f) = \frac{1}{N}$ for any $l \in \{1, 2, \dots, N\}$. Does not explicitly model fertility: some foreign words tend to produce a fixed number of English words. For example, for German-to-English translation, ja is usually omitted, and zum is usually translated to one of to the, for the, to a, for a. == Model 2 == Model 2 allows alignment to be conditional on sentence lengths. That is, we have a probability distribution $p_a(j \mid i, l_e, l_f)$

    Read more →