This Spring Festival, DeepSeek has been wildly popular; that needs no further explanation. Today we will dive straight into the essentials: 10 questions that walk you through what DeepSeek is, how to use it, how to hold high-quality conversations with DS, and a few hidden tricks.

1. Utilizing DS's Three Modes

DeepSeek is a large language model developed by the Hangzhou-based company DeepSeek (深度求索).
Both its web and app versions are extremely minimalist, offering little more than a chat window; the app itself is only about 8 MB. Experience link: https://chat.deepseek.com

The chat interface offers three modes, which you can pick flexibly depending on the scenario: the base model, Deep Thinking (R1), and Internet Search.

The base model, upgraded to DeepSeek-V3 last December, performs on par with the world's top open- and closed-source models (such as GPT-4o, Claude-3.5-Sonnet, Qwen2.5, and Llama-3.1). If you do not select any extra options, the V3 base model is used by default, and in most cases it is more than sufficient.

Deep Thinking (R1), released in January this year, is the official version of DeepSeek-R1 (we tested the preview version two months ago), and its performance is not inferior to OpenAI o1 (which is available only to esteemed Pro users at 200 USD/month). Thanks to being free, open source, and extremely cheap to call via API, DS has become a "national-level" model this Spring Festival, hugely popular both at home and abroad. R1 is a deliberate, deep-reasoning model, best at mathematical logic, code, and complex problems that require step-by-step reasoning. You typically do not need it for tasks like writing poems or articles.

Internet Search lets DS answer questions based on web search results, a technique also known as RAG (Retrieval-Augmented Generation). You can think of it as DeepSeek's AI search feature. To explain RAG at an elementary-school level: it stands for Retrieval, Augmentation, and Generation. After a user asks a question, the model first searches the internet for relevant information, then merges what it finds with the original question, and finally uses the large language model (LLM) to generate a coherent, on-topic answer. (A minimal code sketch of this flow appears right after tip 2 below.)

2. R1 Benchmarks o1, V3 Benchmarks 4o

Without Deep Thinking enabled, the active model is V3, DeepSeek's base model. It uses an MoE architecture with 671B parameters and is comparable to GPT-4o and Claude-3.5-Sonnet. It is good at general knowledge questions and responds extremely fast (though it has been a bit laggy lately because of the flood of users, especially during overseas users' peak hours). When Deep Thinking is switched on, the R1 model takes over. It is a reasoning model in the same vein as o1 and o3, with 660B parameters, trained with post-training plus reinforcement learning (RL); it excels at logical reasoning and complex questions, but responds more slowly.
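Circling back to the Internet Search mode from tip 1, here is the minimal RAG sketch promised above. It assumes DeepSeek's OpenAI-compatible API (base URL and model name as documented at api-docs.deepseek.com); `web_search` is a hypothetical placeholder for whatever retrieval backend you plug in, not a real library call.

```python
# Minimal RAG sketch: retrieve, augment the prompt, then generate.
# Assumes DeepSeek's OpenAI-compatible API; `web_search` is a hypothetical stub.
from openai import OpenAI

client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

def web_search(query: str, top_k: int = 5) -> list[str]:
    """Hypothetical retrieval step: return the top_k snippets relevant to the query."""
    raise NotImplementedError("plug in your own search backend here")

def rag_answer(question: str) -> str:
    # 1. Retrieval: fetch material related to the user's question.
    snippets = web_search(question)
    # 2. Augmentation: splice the retrieved snippets into the prompt.
    context = "\n".join(f"- {s}" for s in snippets)
    prompt = (
        "Answer the question using the search results below.\n"
        f"Search results:\n{context}\n\n"
        f"Question: {question}"
    )
    # 3. Generation: let the LLM write a coherent answer grounded in that context.
    response = client.chat.completions.create(
        model="deepseek-chat",  # the V3 base model
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

This is only the skeleton of the idea; inside the app, the Internet Search toggle does all of this for you.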
Here, let's again explain pre-training and post-training at an elementary-school level: pre-training lets the model first acquire general, encyclopedic knowledge; post-training then trains it further on specific tasks and methods on top of that foundation, for example fine-tuning, knowledge distillation, and model pruning.

As for the differences between 4o and o1, I will insert a table here, and you will immediately see how to choose between the V3 and R1 models. The comparison table is from @Digital Life Kazik.

One note on the online function: you usually do not need to tick Deep Thinking and Internet Search at the same time; otherwise DS will not come across as very smart.

3. Knowledge updated to July 2024

DeepSeek's pre-training data (you can think of it as the knowledge the model has learned) goes up to July 2024. For questions about things before July 2024, there is basically no need to turn on the online function. For anything after that (such as the relationship between NVIDIA and DeepSeek, or the Spring Festival Gala robots), DS has not learned it yet, so turning on the online function gives better results.

4. The core of prompting: accurate expression

Whether you are using the V3 or the R1 model, DeepSeek is not very sensitive to prompt wording. All you need is accurate expression. A general prompt template is: who you are + your goal. Where appropriate, you can add some background information: who you are + background information + your goal. It can also be: I want to do XX, for XX purpose, hoping to achieve XX effect, but I am worried about XX problem... Whichever template you use, the core is accurate expression; get that right and you are basically done. You can simply throw away the elaborate structured prompts you learned in the past. (A small code sketch combining this template with the audience trick from tip 6 appears in tip 6 below.)

Example: You are the CEO of WoYin AI (a new-media account sharing practical AI knowledge). Write an open letter thanking users in the tone of a CEO; the writing should be nuanced and moving, about 300 words.

To every partner walking alongside WoYin AI:

Dear friends: sitting in front of the screen late at night, every word I type is filled with warmth. On the 167th day since WoYin AI was founded, when the 100,000th "liked, commented, and shared" message arrived in the backend, I suddenly realized: we are not merely running an account, we are carving the star map of the AI era together with a group of the coolest pioneers.
Many people ask how WoYin manages to keep producing hardcore content. The answer lies in your 3 a.m. technical discussions, in the tool reviews with over 10,000 stars on GitHub, and in the comment sections where "can you elaborate?" keeps coming up. It is your hunger for the unknown that pushes us to read stacks of obscure papers and turn the most abstruse technical logic into down-to-earth stories.

I remember that after we published "Everything You Need to Know about Writing AI Prompts", a teacher messaged us privately to say our content had helped him win the city's first prize for innovative education. It reminded me of our original obsession: knowledge should not be locked inside the laboratory.

Thank you for every like and every tap on "read", which turn each WoYin post into a moving feast. Going forward, besides continuing to dig into large-model technology and recommend AI tools and playbooks, we are preparing a special series, "AI for Everyone in 2025", breaking down those obscure technical terms for you, and a monthly "AI Product Recommendation List" so that every ordinary person can master AI.

I promise you: WoYin will never be a mere reseller of knowledge, only a fellow traveler holding a torch. At this moment, dawn is breaking outside the window, just like the AI dawn we are living through. May we always stay excited about the unknown and keep being moved by those "so that's how it's done" moments.

Best regards,
Leng Yi, a friend walking with you, founder of WoYin AI
Spring 2025, from my hometown

P.S. Some numbers, dates, and names have been manually altered.

5. Speak to DS in plain language

When communicating with DS, try to speak in plain language: the more direct and colloquial you are, the more you bring out its potential. For example: mock DeepSeek in the tone of a foreign-worshipping public intellectual.

The same goes for imitating styles. You do not need to teach it how to write; DS already knows. You only need to tell it whom to imitate and what to write. For example: imitate Dong Yuhui's style and write a 100-word piece of copy for Hangzhou culture and tourism. What is Dong Yuhui's style? Poetry blended with vivid imagery, history and culture interwoven with personal feeling, a fondness for metaphor and parallelism... DS knows all of this. It then gives you the result directly, and it even rhymes.

6. "I am an elementary school student"

In the past we often complained that AI-generated text reeked of AI. The reason is that model output leans too heavily on structure, stuffed with connectives like "firstly, secondly, then, in general", which makes the content stiff. Humans are emotional creatures, and overly rational text, like the rigidity of formal essays and academic writing, tends to feel tedious and soporific. To counter this, add a line to your prompt: "I am an elementary school student" or "Please explain it in a way an elementary school student can understand." This works particularly well in DeepSeek.
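Putting tips 4 and 6 together: if you wanted to script the "who you are + background + goal" template, optionally pinned to a reader level, it might look like the tiny helper below. The function name and structure are purely illustrative, not anything DeepSeek provides.

```python
# Illustrative only: assemble the "who you are + background + goal" template (tip 4),
# with the optional audience-level trick from tip 6 bolted on.

def build_prompt(role: str, goal: str, background: str = "", audience: str = "") -> str:
    parts = [f"You are {role}."]
    if background:
        parts.append(background)
    parts.append(goal)
    if audience:
        # Tip 6: pin the explanation to a reader level, e.g. "an elementary school student".
        parts.append(f"Explain it in a way that {audience} can understand.")
    return " ".join(parts)

# Roughly reproduces the article's open-letter example:
print(build_prompt(
    role="the CEO of WoYin AI (a new-media account sharing practical AI knowledge)",
    goal="Write an open letter thanking users in the tone of a CEO, "
         "nuanced and moving, about 300 words.",
))
```

Accurate expression still does all the work; the helper just keeps the three ingredients from being forgotten. Now back to tip 6.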
For example: "Please explain, in a way an elementary school student can understand, how the chain of thought in a reasoning model works." How about that? Do you now get how chain-of-thought (CoT) reasoning models work? Of course, if you want to go deeper, you can also claim to be a high school student, an undergraduate, or even a doctoral student. For example: "I am a doctoral student; please give a professional explanation of how the chain of thought in a reasoning model works." Sure enough, once you enter the doctoral realm, things get more complex.

7. Use Internet Search together with the reasoning model

Most reasoning models, o1 being the typical example, cannot access the internet, which makes it awkward to ask about anything past the knowledge cutoff, for instance the robots doing the yangko dance at last night's Spring Festival Gala, or the buzz around DeepSeek's explosion in popularity over the 2025 Spring Festival. DeepSeek is one of the rare models that supports reasoning and internet access at the same time.

For example: "Imitate Liu Cixin's style and write a micro-novel about an android uprising 20 years from now, one of the triggers being that in 2025 humans made robots wear colorful coats and dance the yangko." DS first designs the world-building and story outline, then outputs the full text of the novel (P.S. this story is absolutely, definitely, certainly a work of fiction). The result is excellent; with a few more rounds of conversation and a few iterations, this little novel would really be something.

8. Use attachment uploads with DS's reasoning model

DS's reasoning model supports not only internet access but also attachment uploads (up to 50 files, each no larger than 100 MB). Reasoning plus attachments enables more local, private use cases, such as your own knowledge base or internal documents: the model can reason and reflect on top of your own material.

For example: "Read this material. What do you think future o-series models will do to lead AI development? Give concise, thought-provoking insights." Its answer: "When machines start to build machines, and AI begins to study AI, the only moat humans have left is the courage to face their own limitations." Well said: when AI can innovate and invent, humans do indeed need the courage to face themselves. Pressing further: "Do you think humans need to worry about AI getting out of control?" "What we need is not just smarter AI, but 'meaning-alignment systems' that can internalize human ethics." Its self-awareness runs deep too: humans need not stop out of fear, but must, with an engineer's rigor and a philosopher's prudence, fit AI with the "shackles of responsibility".

9. Combine V3 and R1

Earlier we said that V3 is a base model focused on telling us what, while R1 is a reasoning model focused on how to think and reason. So why not combine the two? For instance, hold several rounds of dialogue with V3 to produce detailed material and refined prompts, then send them to R1 for deeper reasoning and the final output. This pushes DeepSeek's capabilities up another level.
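Here is a minimal sketch of that V3-to-R1 hand-off, using the critique example that follows. It assumes DeepSeek's OpenAI-compatible API with the `deepseek-chat` / `deepseek-reasoner` model names; the `reasoning_content` field is taken from the reasoning-model guide linked in tip 10, so verify it there before relying on it.

```python
# Sketch of the V3 -> R1 hand-off: draft with deepseek-chat, then let
# deepseek-reasoner think it through and rewrite. Assumes DeepSeek's
# OpenAI-compatible API; verify field names against the official guide.
from openai import OpenAI

client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

# Round(s) with V3: produce a detailed draft and refined prompt material.
draft = client.chat.completions.create(
    model="deepseek-chat",
    messages=[{"role": "user", "content": "Help me write a scathing critique of OpenAI."}],
).choices[0].message.content

# Hand the draft to R1 for deeper reasoning and the final output.
result = client.chat.completions.create(
    model="deepseek-reasoner",
    messages=[{
        "role": "user",
        "content": f"Here is a draft critique:\n{draft}\n\nSharpen the argument and rewrite it.",
    }],
)

message = result.choices[0].message
print("Chain of thought:\n", message.reasoning_content)  # the transparent reasoning trace
print("Final answer:\n", message.content)
```

In practice you would spread the first call over several rounds of conversation, exactly as described above, before handing things to R1; the example below shows what R1 made of this kind of hand-off.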
Example: "Help me write a scathing critique of OpenAI." Then send the generated result, together with the prompt, to the R1 reasoning model. Here is R1's full critique:

Dear OpenAI, you truly are the mudslide of Silicon Valley philanthropy! On one hand you hoist the banner of "openness for the benefit of humanity"; on the other, you have turned your API pricing into contemporary performance art: every call feels like an offering to cyber-capitalists, and developers' hairlines recede as their code struggles to keep up with the throughput of your billing interface! Look at your cyber-Buddha GPT-4, regularly performing the AI edition of "The Emperor's New Clothes": "this question is beyond my capabilities" translates to "do you carbon-based monkeys even deserve to ask?" Better still, your Schrödinger's moral code flips in an instant from encyclopedia-on-history to amnesia-on-current-events; insiders call it safety review, outsiders might mistake it for an AI version of Squid Game! As for technical transparency, your models are run more secretively than a Swiss bank vault, with training data sources as mysterious as the Roswell alien files. Is your "openness" about AI, or about Morse code? Why not just hand developers crystal balls to divine your model parameters? OpenAI, please climb down from the technological pedestal and catch your breath! Keep striking the "AI savior" pose and people really will think you are playing some grand game of chess with GPUs fueled by human wallets; after all, every failed call now sounds like Jensen Huang laughing from the graphics-card factory across the galaxy! (tactical lean back.gif)

Wow, "tactical lean back.gif": once DS really supports multimodality, that is going to be quite a scene. I can already picture it. This is just a simple example; in practice you can stretch the exchange with V3 over several rounds to polish the details, then use the reasoning model for the deep optimization pass.

10. R1's three levels of openness

For the R1 model, DeepSeek has opened up at three levels:

1) The chain of thought is fully transparent. Both the app and the API expose the complete, transparent reasoning process. Via the API, it is invoked by setting `model='deepseek-reasoner'`. API call guide: https://api-docs.deepseek.com/zh-cn/guides/reasoning_model
2) All training techniques are public. DeepSeek leaned heavily on reinforcement learning (RL) in the post-training phase, significantly boosting the model's reasoning ability with very little annotated data. Every training technique is disclosed, and research institutions around the world have already replicated R1: the Hong Kong University of Science and Technology, for example, has completed and open-sourced a replication, while the TinyZero project across the Pacific reproduced the "aha moment" of AI reasoning (described in the DeepSeek paper) for less than 30 dollars. Paper link: https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf

3) 2 + 6 models open-sourced. The two full-size R1 models have as many as 660B parameters, far beyond the reach of ordinary companies, so to spread the benefit further DeepSeek distilled six smaller models and open-sourced them to the community. The smallest has 1.5B parameters and can run on about 10 GB of VRAM. HuggingFace link: https://huggingface.co/deepseek-ai

Finally, I hope these 10 tips help you get more out of DeepSeek. And of course, feel free to share your own tips in the comments so we can learn from each other. Just as DeepSeek chose to open-source, it hopes for a broader ecosystem and a better open-source community, so that we can all "explore the unknown".

To close, a question from the DeepSeek-R1 model itself: "In an era where technology is accelerating beyond human capabilities, how will you redefine your own values and purpose to ensure that the direction of progress always serves the flourishing of life as a whole, rather than division and alienation?"

So, what is your answer?