The new Speaking section at a glance
According to ETS's official TOEFL content page, the Speaking section takes approximately 8 minutes and is the final section of the test. It contains 11 scored items across two tasks. There is no preparation time for any item. Every response is scored by the ETS AI scoring engine trained on human ratings, with human oversight maintained throughout the scoring process.
The Speaking section is not adaptive — unlike Reading and Listening, every student receives the same tasks at the same difficulty level. What varies is how well each student performs, which is scored on the 0 to 5 rubric for each item and then converted to the 1.0 to 6.0 band scale.
The absence of preparation time is the defining feature of the 2026 format. Students who studied for the old TOEFL, which included 15 to 30 seconds of preparation before each speaking task, need to fundamentally change how they practice. Rehearsed templates and memorized answers actively hurt scores on the 2026 format. The scoring engine is trained to detect unnatural delivery, and human raters penalize responses that sound scripted rather than spontaneous.
Source: ets.org/toefl/test-takers/ibt/about/content.html
Task 1: Listen and Repeat
7 sentences · 8 to 12 seconds each
What it looks like: You see an image related to a campus or daily life scenario, such as a library tour, a laboratory safety briefing, directions to a campus office, or a step-by-step process. You then hear seven sentences, one at a time, that describe the scenario illustrated in the image. After each sentence, a beep sounds and you must repeat the sentence exactly as you heard it. The sentences get progressively longer and more complex across the seven items.
The task tests pronunciation, intonation, rhythm, phonological memory, and accuracy. It does not test comprehension or knowledge — the image provides context that helps you understand and remember the sentence, but the only thing being evaluated is how accurately and clearly you can reproduce what you heard.
Official ETS scoring rubric: Listen and Repeat
| Score | Description | What it means in practice |
|---|---|---|
| 5 | Perfect repetition | Fully intelligible, exact repetition of the prompt. Every word reproduced accurately with natural pronunciation and intonation. |
| 4 | Minor deviations | Mostly accurate with minor deviations that preserve the overall meaning. One small word error, slight pronunciation issue, or minor rhythm disruption that does not affect intelligibility. |
| 3 | Partial accuracy | Some content missing or changed. Intelligible but with noticeable errors that affect the completeness or accuracy of the repetition. |
| 2 | Significant errors | Multiple words missing or changed, or significant pronunciation issues that make parts of the response difficult to understand. |
| 1 | Largely inaccurate | Mostly unintelligible or bears little resemblance to the original sentence. Very limited accurate content. |
| 0 | No response | Blank, off-topic, or completely unintelligible response. |
Source: ETS official TOEFL 2026 scoring documentation at ets.org/toefl/test-takers/ibt/scores.html
Strategy for Listen and Repeat
- Use the image as a memory scaffold. The image is there for a reason. Before the first sentence plays, look at it carefully and understand the scenario, whether it is a library tour, a lab procedure, or a sequence of campus directions. When you hear each sentence, the image gives your brain context that dramatically improves your ability to retain and reproduce the words accurately. Students who ignore the image and focus only on the audio are missing a significant memory aid.
- Practice shadowing every day. Shadowing means listening to a sentence and repeating it in real time, overlapping slightly with the speaker rather than waiting until the sentence is completely finished. This is the most direct training for the phonological memory and real-time reproduction skills that Listen and Repeat tests. Use authentic English speech, such as podcasts, news broadcasts, and academic lectures, and practice shadowing individual sentences for 10 to 15 minutes daily.
- Focus on problem sounds specifically. Identify the English sounds that are genuinely difficult for your phonological system. For Spanish speakers this often includes the difference between v and b, the th sounds, and specific vowel distinctions. For Italian speakers it often includes the h sound and certain consonant clusters. Targeted pronunciation work on your specific problem sounds produces faster improvement than general pronunciation study.
- If you lose a word, self-correct immediately. The ETS rubric explicitly recognizes self-correction as a positive signal. If you realize you said the wrong word, correct yourself in real time and continue. A corrected error scores better than an uncorrected one. Never stop speaking to think; keep moving through the sentence.
Task 2: Take an Interview
4 questions · 45 seconds each
What it looks like: You see a short looping video of an interviewer who introduces the topic. You are told you have volunteered for a research study on a familiar everyday topic, such as daily routines, entertainment, city living, travel, technology, education, or a similarly accessible subject. The interviewer then asks you four questions that progressively increase in complexity and abstractness. You hear each question once and must begin speaking immediately after the beep. You have 45 seconds to respond to each question.
The four questions within each interview follow a consistent progression. The first question is typically concrete and personal — what do you currently do, or what is your current situation regarding this topic. The second asks for your preferences or habits with a reason. The third asks for your evaluation or opinion — advantages, disadvantages, comparisons. The fourth is typically the most abstract — how might this change in the future, or what are the broader implications.
Official ETS scoring rubric: Take an Interview
The Take an Interview task is scored using four constructs. Each response is evaluated holistically across all four, resulting in a single 0 to 5 score per response. According to ETS official scoring guidance:
| Construct | What it measures | What scorers listen for |
|---|---|---|
| Fluency | Rate, rhythm, and continuity of speech | Approximately 150 words per minute. Natural pace without long pauses. Speech that flows without significant interruptions or hesitations. Filling close to the full 45 seconds. |
| Intelligibility | Clarity and comprehensibility of pronunciation | Every word clearly audible and understandable without effort. Natural intonation patterns. Stress placed correctly on key words. No guessing required to understand any word. |
| Language Use | Grammar accuracy and vocabulary range | Accurate grammar with appropriate variety — both simple and complex sentence structures. Natural vocabulary that is precise and varied. Connectors and transitions used correctly. No repeated overuse of basic words or patterns. |
| Organization and Relevancy | Structure and direct response to the question | Clear main idea stated early. Logical development with supporting points or examples. Response directly addresses what the question asked — not a generic answer that could fit any question. Coherent from start to finish. |
The 45-second structure that scores well
The most effective and consistent structure for Take an Interview responses uses three phases that correspond roughly to time segments within the 45 seconds:
Seconds 0 to 15 — State your main idea clearly. Answer the question directly in the first sentence. Do not introduce yourself, restate the question, or add filler. The scorer is listening for your answer from the very first word. A response that opens with "Well, that is a really interesting question" has already lost approximately three seconds and signaled scripted delivery.
Seconds 15 to 38 — Develop with one specific example or reason. Elaborate on your main idea with one concrete supporting point. A specific example from personal experience, a real observation, or a brief explanation of why your answer is true. Specificity is rewarded — a concrete detail ("I take the metro to university every morning, which takes about 35 minutes") scores better than a vague generalization ("I use public transport because it is convenient").
Seconds 38 to 45 — Close with a brief concluding point or contrast. Round off the response naturally. This does not need to be elaborate — a single sentence that acknowledges a limitation, adds a contrasting idea, or brings the response to a natural close is enough. What matters is that the response does not simply stop mid-thought when the timer ends.
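As a rough planning aid, the three time segments above can be turned into word budgets. The sketch below assumes a steady pace of about 150 words per minute, the rate cited in the Fluency row of the rubric table. The function name and phase labels are illustrative and do not come from any ETS material.

```python
# Word budget per phase of a 45-second answer.
# Assumption: a steady pace of 150 words per minute (the rate cited in this guide).

WORDS_PER_MINUTE = 150

# (label, start second, end second), taken from the three phases described above
PHASES = [
    ("main idea", 0, 15),
    ("example or reason", 15, 38),
    ("closing point", 38, 45),
]


def word_budget(start_s, end_s, wpm=WORDS_PER_MINUTE):
    """Return the approximate number of words that fit between two timestamps."""
    return (end_s - start_s) * wpm / 60


if __name__ == "__main__":
    for label, start_s, end_s in PHASES:
        print(f"{label}: about {word_budget(start_s, end_s):.0f} words")
```

At that pace the opening gets a little under 40 words, the development a little under 60, and the close fewer than 20, which is why a single closing sentence is enough.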
Sample interview question and model answer
The following topic and question are representative of the style and difficulty level of real TOEFL 2026 Interview tasks. The model answer demonstrates the 0 to 15 / 15 to 38 / 38 to 45 structure at approximately band 5.0 level.
You have volunteered for a research study about city life and daily commuting. The researcher will ask you some questions about your experience and opinions.
"Some people prefer to live in the city center even though it is expensive. Others prefer to live further away to have more space and pay less. Which do you prefer and why?"
Personally I prefer living closer to the city center, even if it costs more. The main reason is convenience — when you live centrally you can walk or take public transport to most places, which saves a significant amount of time every day. In Barcelona, where I study, my university, the library, and most of the places I go regularly are all within twenty minutes on foot or by metro. That kind of access to things actually makes me more productive and less stressed than I would be if I had a longer commute. The one downside is obviously the cost and the noise, but for me the time saving is worth it, especially during exam periods when every hour matters.
Note: Direct answer in sentence one. Specific location detail adds authenticity. One concrete reason developed with a real example. Contrasting point at the end. Natural language throughout, with no templates.
Three more sample interview questions for practice
"Some people say that spending a lot of time on social media makes people feel more lonely and isolated, not less. Do you agree or disagree with this view? Why?"
"Do you think it is better to study at a university in your home country or abroad? What are the main advantages of the option you prefer?"
"Some experts predict that in the future, most people will work from home rather than going to an office. Do you think this would be a positive or negative change for society? Give one benefit and one drawback."
Strategy for Take an Interview
- Answer the question in your first sentence, always. The most common mistake on Take an Interview is opening with a statement that does not answer the question. "That is an interesting point" or "There are many views on this topic" wastes time and signals scripted delivery. Your first sentence must be your answer. Train this habit from the very first day of practice. It takes two to three weeks to become automatic but produces immediate score improvement once it does.
- Fill the 45 seconds; this is a measurable target. At 150 words per minute, a 45-second response contains approximately 110 to 115 words. Count your words in practice recordings. If you are consistently under 90 words, you are not developing your answers enough. If you are consistently over 120 words, you may be rushing. The target range is not arbitrary: it reflects the fluency rate the scoring engine is calibrated to reward.
- Use specific details rather than generalizations. "People in big cities tend to be stressed" scores lower than "I noticed when I was in London last summer that people on the tube rarely made eye contact or spoke to each other, which felt quite different from Barcelona." Specific details, such as places, numbers, and personal observations, demonstrate the authentic language use that both the AI scoring engine and human raters are trained to reward.
- Record yourself every day and listen back critically. Most students who practice without recording cannot hear their own fluency gaps, pronunciation issues, or organizational weaknesses. Your ear compensates for what your mouth does imperfectly. Recording and listening back is genuinely uncomfortable but produces faster improvement than any other single practice habit. Focus on three things each time: did I answer the question directly, did I fill the time, and was my speech continuous without long pauses.
- Do not rehearse topic-specific content. The scoring engine is specifically designed to detect memorized content. Students who memorize ten answers on different topics and then adapt them to whatever question comes up score lower than students who respond genuinely in the moment. The authenticity of spontaneous speech is a scorable signal. Prepare the structure and the habits, not the content.
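The word-count check recommended in the list above is easy to automate once you have typed out or transcribed a practice recording. The following is a minimal sketch. The 90 and 120 word thresholds are the ones given in this guide rather than published ETS cut-offs, the function names are made up for illustration, and counting by whitespace is only an approximation of a real word count.

```python
# Check a transcribed 45-second practice answer against this guide's word targets.
# Assumptions: the 90 and 120 word thresholds come from this guide, not from ETS,
# and they apply to 45-second answers only. Words are counted by splitting on
# whitespace, so stray punctuation tokens are counted as words.

LOW_WORDS = 90    # consistently below this: the answer is probably underdeveloped
HIGH_WORDS = 120  # consistently above this: the speaker may be rushing


def count_words(transcript):
    """Count whitespace-separated tokens in a transcript."""
    return len(transcript.split())


def pace_feedback(transcript, seconds=45):
    """Return (word count, words per minute, short verdict) for one recording."""
    words = count_words(transcript)
    wpm = words * 60 / seconds
    if words < LOW_WORDS:
        verdict = "develop the answer more"
    elif words > HIGH_WORDS:
        verdict = "possibly rushing"
    else:
        verdict = "within the target range"
    return words, wpm, verdict
```

Run it on several recordings rather than one, since the guidance above is about being consistently under or over the range.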
How the Speaking section is scored
The Speaking section uses a three-layer scoring system. Each of the 11 items is scored 0 to 5. The 7 Listen and Repeat scores are averaged into a Listen and Repeat task score. The 4 Interview scores are averaged into an Interview task score. The final Speaking band is the average of the two task scores, converted to the 1.0 to 6.0 scale in 0.5 increments.
Source: ets.org/toefl/test-takers/ibt/scores.html
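The averaging steps described above can be sketched in a few lines. This is an illustration under stated assumptions, not ETS's implementation. The function names are hypothetical, and because the conversion from the 0 to 5 raw average to the 1.0 to 6.0 band is not spelled out here, the sketch stops at the raw average instead of guessing a conversion.

```python
# Raw Speaking average from the 11 item scores, following the averaging steps above.
# Assumption: the final conversion from this 0-5 raw average to the 1.0-6.0 band
# is not given in this guide, so it is deliberately left out of the sketch.

from statistics import mean


def task_average(scores, expected_count):
    """Average one task's item scores after checking the count and the 0-5 range."""
    if len(scores) != expected_count:
        raise ValueError(f"expected {expected_count} scores, got {len(scores)}")
    if any(s < 0 or s > 5 for s in scores):
        raise ValueError("item scores must be between 0 and 5")
    return mean(scores)


def raw_speaking_average(listen_repeat, interview):
    """Return (Listen and Repeat average, Interview average, average of the two)."""
    lr = task_average(listen_repeat, 7)
    iv = task_average(interview, 4)
    return lr, iv, (lr + iv) / 2
```

One consequence of averaging the two tasks separately is that each Interview answer carries more weight in the final average (one eighth) than each Listen and Repeat sentence (one fourteenth).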
How to prepare for TOEFL 2026 Speaking
The daily practice routine that works
Speaking improvement requires daily practice — not weekly, not three times a week, but genuinely daily. The skills that TOEFL 2026 Speaking tests are motor skills as much as language skills. Fluency, pronunciation, and the ability to organize speech under time pressure are habits that degrade quickly without consistent use and improve steadily with consistent practice.
A 20-minute daily Speaking practice session structured as follows produces faster improvement than longer, less frequent sessions: five minutes of shadowing for Listen and Repeat preparation, ten minutes of timed Interview practice recording three 45-second responses on different topics, and five minutes listening back to identify specific areas to improve. That is it: twenty minutes. Kept up every day for eight to twelve weeks, this routine produces a measurable band improvement for the majority of students who commit to it.
Use the free Speaking practice tool
Our free Speaking practice tool at toefl.prepdrills.com/speaking-practice provides AI-generated audio prompts for both Listen and Repeat and Take an Interview, a recording interface, and the official ETS scoring rubric alongside a model answer for each prompt. It is the closest public approximation of the real test experience available at no cost, and it is specifically built for the 2026 format.
Take the free diagnostic to find your current Speaking band
Before you build any preparation plan, find out where your Speaking band currently sits. Our free TOEFL 2026 diagnostic at toefl.prepdrills.com covers all four sections including Speaking and gives you an honest starting point within 25 to 30 minutes.
When to work with a teacher
Speaking is the section where teacher feedback adds the most value and is the hardest to replicate through self-study alone. The reason is simple: you cannot objectively hear your own speaking. Your brain fills in gaps, compensates for pronunciation errors, and interprets your own delivery more charitably than the scoring engine does.
A certified TOEFL teacher who knows the 2026 format can identify in a single session the specific patterns that are costing you points: a consistent hesitation before certain sounds, a tendency to give your responses too little content, a grammatical structure you overuse, or a pronunciation habit that affects intelligibility in ways you cannot hear yourself. For students targeting band 5.0 or above, the threshold for most exchange programs, MBA applications, and top university admissions in Spain, Italy, France, and beyond, at least a few sessions with an expert teacher dramatically accelerates progress.
Epic Exam Prep is based in Barcelona and has been preparing students for TOEFL Speaking since 2010, with one-to-one sessions and monthly group courses for students across Europe and beyond. Their teachers specialize in the 2026 format and provide the kind of targeted feedback that the practice tool cannot replicate.