![Tar9897](https://cdn-avatars.huggingface.co/v1/production/uploads/66055c33d0703e48e206c606/VPBpTh06gJ6pZ5bgQcUQJ.png)
about an hour ago
In parallel with grammar learning, the agent would also use language grounding techniques to link words to their sensory representations and abstract concepts. In this way, the agent learns word meanings, synonyms, antonyms, and semantic relationships from both textual data and perceptual experiences.
The result would be the agent developing a rich lexicon and conceptual knowledge base that underlies both its language understanding and its generation. With this basic knowledge of grammar and word meanings, the agent can then learn to combine words and phrases to express specific ideas or concepts. Building on this, it would learn to generate complete sentences, which it would continuously refine and improve. Eventually it would learn to generate sequences of sentences in the form of dialogues or narratives, taking context, goals, and user feedback into account.
I believe that by gradually learning how to improve its responses, the agent would also acquire the ability to generate coherent, meaningful, and contextually appropriate language. This would let it reason without hallucinating, something LLMs struggle with.
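To make the grounding idea concrete, here is a toy sketch (illustrative only; the lexicon vectors and helper functions are invented for this example, not our actual system):

```python
import numpy as np

# Toy grounded lexicon: each word maps to an invented sensory feature vector.
lexicon = {
    "red":  np.array([1.0, 0.0, 0.0, 0.2]),
    "blue": np.array([0.0, 0.0, 1.0, 0.2]),
    "ball": np.array([0.1, 0.1, 0.1, 1.0]),
}

def ground_phrase(words):
    """Compose a phrase meaning by summing the grounded word vectors."""
    return sum(lexicon[w] for w in words)

def describe(percept, candidates):
    """Generation step: pick the phrase whose grounding best matches a percept."""
    def cos(a, b):
        return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))
    return max(candidates, key=lambda ws: cos(ground_phrase(ws), percept))

percept = np.array([0.9, 0.1, 0.1, 1.1])  # an observation resembling a red ball
print(describe(percept, [("red", "ball"), ("blue", "ball")]))  # ('red', 'ball')
```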
Developing such agents would not require much compute, and the code would be simple and easy to understand. It would introduce everyone to symbolic AI and to building agents that are good at reasoning tasks, addressing a crucial problem with LLMs. We have used a similar architecture to make our model learn constantly. Do sign up as we start opening access next week at
![tonywu71](https://cdn-avatars.huggingface.co/v1/production/uploads/1650784534234-noauth.png)
about 3 hours ago
Our latest research paper, "ColPali: Efficient Document Retrieval with Vision Language Models," introduces a groundbreaking approach to large-scale visual document analysis. By leveraging Vision Language Models (VLMs), we have created a new framework for document retrieval that's both powerful and efficient.
Key Insights:
💡 ColPali combines ColBERT's multi-vector strategy with VLMs' document understanding capabilities (a toy scoring sketch follows these insights)
⚙️ ColPali is based on PaliGemma-3B (SigLIP, Gemma-2B) + a linear projection layer and is trained to maximize the similarity between the document and the query embeddings
📊 The Vision Document Retrieval benchmark (ViDoRe) is a challenging dataset that spans various industry topics and aims at matching real-life retrieval scenarios
🏆 ColPali outperforms existing models on all datasets in ViDoRe (average NDCG@5 of 81.3% vs 67.0% for the best baseline model)
⚡ ColPali is faster at document embedding compared to traditional PDF parser pipelines, making ColPali viable for industrial use
🔍 ColPali is highly interpretable thanks to patch-based similarity maps
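For intuition, here is a minimal sketch of ColBERT-style late-interaction (MaxSim) scoring of the kind ColPali uses; the tensor shapes are illustrative, and this is not our official implementation:

```python
import torch

def maxsim_score(query_emb: torch.Tensor, doc_emb: torch.Tensor) -> torch.Tensor:
    """Late interaction: each query token is matched to its most similar
    document patch, and the best matches are summed over query tokens.

    query_emb: (num_query_tokens, dim), doc_emb: (num_doc_patches, dim),
    both assumed L2-normalized so dot products are cosine similarities.
    """
    sim = query_emb @ doc_emb.T          # (num_query_tokens, num_doc_patches)
    return sim.max(dim=1).values.sum()   # MaxSim over patches, summed over tokens

# Toy example with random embeddings
q = torch.nn.functional.normalize(torch.randn(16, 128), dim=-1)
d = torch.nn.functional.normalize(torch.randn(1024, 128), dim=-1)
print(maxsim_score(q, d))
```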
Dive deeper into ColPali and explore our resources:
📄 Full paper: arxiv.org/abs/2407.01449
🛠️ Datasets, model weights, evaluation code, leaderboard, demos: huggingface.co/vidore
Shoutout to my amazing co-authors Manuel Faysse ( @manu ) and Hugues Sibille ( @HugSib ). We are grateful for the invaluable feedback from Bilel Omrani, Gautier Viaud, Celine Hudelot, and Pierre Colombo. This work is sponsored by ILLUIN Technology. ✨
![ezgikorkmaz](https://cdn-avatars.huggingface.co/v1/production/uploads/667c1a5acb6800a191024eb9/AqL8mQZsZjpZKi9FxtkIH.png)
about 4 hours ago
A Survey Analyzing Generalization in Deep Reinforcement Learning
Paper:
GitHub:
![DmitryRyumin](https://cdn-avatars.huggingface.co/v1/production/uploads/noauth/nRCxbVng_PPBqKd-Z3KVc.jpeg)
about 7 hours ago
🚀 Title: Expressive Gaussian Human Avatars from Monocular RGB Video
📝 Description: The new EVA model enhances the expressiveness of digital avatars by using 3D Gaussians and SMPL-X to capture fine-grained hand and face details from monocular RGB video.
👥 Authors: Hezhen Hu, Zhiwen Fan, Tianhao Wu, Yihan Xi, Seoyoung Lee, Georgios Pavlakos, and Zhangyang Wang
📄 Paper:
🌐 Github Page:
📁 Repository:
📚 CVPR-2023-24-Papers:
📚 WACV-2024-Papers:
📚 ICCV-2023-Papers:
📚 More Papers: more cutting-edge research presented at other conferences in the collection curated by @DmitryRyumin
🚀 Added to the Avatars Collection:
🔍 Keywords: #DigitalAvatars #3DModeling #ComputerVision #MonocularVideo #SMPLX #3DGaussians #AvatarExpressiveness #HandTracking #FacialExpressions #AI #MachineLearning
![iofu728](https://cdn-avatars.huggingface.co/v1/production/uploads/6278bd42541f3d2dfa77ea70/ejn49eapnB3UXQckAYdTd.jpeg)
about a day ago
For more details, please check:
project page:
code:
paper:
hf demo:
![fdaudens](https://cdn-avatars.huggingface.co/v1/production/uploads/647f36a8454af0237bd49574/jshkqBUTY-GZL8As8y6Aq.jpeg)
about a day ago
Find exactly what you need with filters for:
- Modalities (text, image, audio, etc.)
- Dataset size
- File format
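If you'd rather script it, huggingface_hub exposes similar facets; the exact filter tag string below is my assumption, so check the hub docs if it doesn't match:

```python
from huggingface_hub import HfApi

api = HfApi()
# "modality:audio" mirrors the hub's modality facet (assumed tag string).
for ds in api.list_datasets(filter="modality:audio", limit=5):
    print(ds.id)
```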
Try it now:
What other filters would you find useful? Drop your ideas!
![victor](https://cdn-avatars.huggingface.co/v1/production/uploads/1616001397867-5f17f0a0925b9863e28ad517.png)
about a day ago
I'd be super happy to give you a GPU grant to host it on a Space, it would allow more people to discover and use it!
about a day ago
- 🎛️ Choose between the 27B IT and 9B IT models
- 🚀 Fast inference using llama.cpp (see the sketch after this list)
-
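As a rough illustration of llama.cpp inference from Python via llama-cpp-python (the GGUF filename is hypothetical; point it at your local Gemma 2 IT quant):

```python
from llama_cpp import Llama

# Hypothetical GGUF filename for illustration.
llm = Llama(model_path="gemma-2-9b-it.Q4_K_M.gguf", n_ctx=4096)
out = llm("Explain what a GGUF file is in one sentence.", max_tokens=64)
print(out["choices"][0]["text"])
```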
![tomaarsen](https://cdn-avatars.huggingface.co/v1/production/uploads/6317233cc92fd6fee317e030/cJHSvvimr1kqgQfHOjO5n.png)
about a day ago
📊 Trained on a large dataset of 558k Arabic triplets translated from the AllNLI triplet dataset:
6️⃣ 6 different base models: AraBERT, MarBERT, LaBSE, MiniLM, paraphrase-multilingual-mpnet-base, mpnet-base, ranging from 109M to 471M parameters.
💪 Trained with a Matryoshka loss, allowing you to truncate embeddings with minimal performance loss: smaller embeddings are faster to compare (see the sketch after these bullets).
🏆 Outperforms all commonly used multilingual models like , , and .
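Here's a small sketch of how Matryoshka truncation works at query time (the model id is a placeholder, not one of the released checkpoints):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Placeholder model id for illustration.
model = SentenceTransformer("your-org/arabic-matryoshka-model")
emb = model.encode(["القطة تجلس على السجادة", "قطة صغيرة فوق سجادة"])

# Matryoshka property: keep only the first k dimensions, then re-normalize.
k = 256
small = emb[:, :k]
small = small / np.linalg.norm(small, axis=1, keepdims=True)
print(small @ small.T)  # cosine similarity on the truncated embeddings
```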
Check them out here:
-
-
-
-
-
-
Or the collection with all:
My personal favourite is likely : it is very efficient at 135M parameters and scores #1 on .
![alvdansen](https://cdn-avatars.huggingface.co/v1/production/uploads/635dd6cd4fabde0df74aeae6/5n9WPf3O1wqIOqhE1DtMK.jpeg)
about a day ago
I put together my own page of models using their code and LoRA. Enjoy!