TOEFL 2026 Speaking Section: The Complete Guide

The TOEFL 2026 Speaking section is unlike any previous version of the test. The four familiar tasks with preparation time are completely gone. In their place are two tasks that test exactly one thing: your ability to speak naturally, accurately, and fluently in real time with zero preparation. This guide covers both task types in full, the official ETS scoring rubrics, what band 5.0 actually looks like in practice, and the preparation strategies that produce real improvement for students who need TOEFL for exchange programs, MBA applications, and university admissions in the US, UK, and Europe.

The new Speaking section at a glance

According to ETS's official TOEFL content page, the Speaking section takes approximately 8 minutes and is the final section of the test. It contains 11 scored items across two tasks. There is no preparation time for any item. Every response is scored by the ETS AI scoring engine trained on human ratings, with human oversight maintained throughout the scoring process.

8 Minutes total
11 Scored items
2 Task types
0 Seconds prep time

The Speaking section is not adaptive — unlike Reading and Listening, every student receives the same tasks at the same difficulty level. What varies is how well each student performs, which is scored on the 0 to 5 rubric for each item and then converted to the 1.0 to 6.0 band scale.

The absence of preparation time is the defining feature of the 2026 format. Students who prepared extensively for the old TOEFL — which included 15 to 30 seconds of preparation before each speaking task — need to fundamentally reorient their preparation approach. Rehearsed templates and memorized structures actively hurt scores on the 2026 format. The scoring engine is trained to detect unnatural delivery, and human raters penalize responses that sound scripted rather than spontaneous.

Source: ets.org/toefl/test-takers/ibt/about/content.html


Task 1: Listen and Repeat

7 sentences · 8 to 12 seconds each
Items: 7 sentences
Prep time: None
Recording time: 8 to 12 seconds
Topic: Campus or daily life
Visual aid: Image shown

What it looks like: You see an image related to a campus or daily life scenario — a library tour, a laboratory safety briefing, directions to a campus office, or a step-by-step process. You then hear seven sentences, one at a time, that describe the scenario illustrated in the image. After each sentence, a beep sounds and you must repeat the sentence exactly as you heard it. The sentences get progressively longer and more complex across the seven items.

The task tests pronunciation, intonation, rhythm, phonological memory, and accuracy. It does not test comprehension or knowledge — the image provides context that helps you understand and remember the sentence, but the only thing being evaluated is how accurately and clearly you can reproduce what you heard.

Official ETS scoring rubric: Listen and Repeat

Score | Description | What it means in practice
5 | Perfect repetition | Fully intelligible, exact repetition of the prompt. Every word reproduced accurately with natural pronunciation and intonation.
4 | Minor deviations | Mostly accurate with minor deviations that preserve the overall meaning. One small word error, slight pronunciation issue, or minor rhythm disruption that does not affect intelligibility.
3 | Partial accuracy | Some content missing or changed. Intelligible but with noticeable errors that affect the completeness or accuracy of the repetition.
2 | Significant errors | Multiple words missing or changed, or significant pronunciation issues that make parts of the response difficult to understand.
1 | Largely inaccurate | Mostly unintelligible or bears little resemblance to the original sentence. Very limited accurate content.
0 | No response | Blank, off-topic, or completely unintelligible response.

Source: ETS official TOEFL 2026 scoring documentation at ets.org/toefl/test-takers/ibt/scores.html

The scoring reality most students do not know

A single word error reduces your Listen and Repeat score from 5 to 4. This means perfection is the only path to a top Listen and Repeat task score. However, and this matters, a 4 on Listen and Repeat is still a strong score that supports a high Speaking band overall. Do not let minor errors derail your concentration on subsequent items. Move on immediately, focus on the next sentence, and deliver it as accurately as possible.

Strategy for Listen and Repeat

  • 1. Use the image as a memory scaffold. The image is there for a reason. Before the first sentence plays, look at it carefully and understand the scenario: a library tour, a lab procedure, or a sequence of campus directions. When you hear each sentence, the image gives your brain context that dramatically improves your ability to retain and reproduce the words accurately. Students who ignore the image and focus only on the audio are missing a significant memory aid.
  • 2. Practice shadowing every day. Shadowing means listening to a sentence and repeating it in real time, overlapping slightly with the speaker rather than waiting until the sentence is completely finished. This is the most direct training for the phonological memory and real-time reproduction skills that Listen and Repeat tests. Use authentic English speech (podcasts, news broadcasts, academic lectures) and practice shadowing individual sentences for 10 to 15 minutes daily.
  • 3. Focus on problem sounds specifically. Identify the English sounds that are genuinely difficult for your phonological system. For Spanish speakers this often includes the difference between v and b, the th sounds, and specific vowel distinctions. For Italian speakers it often includes the h sound and certain consonant clusters. Targeted pronunciation work on your specific problem sounds produces faster improvement than general pronunciation study.
  • 4. If you lose a word, self-correct immediately. The ETS rubric explicitly recognizes self-correction as a positive signal. If you realize you said the wrong word, correct yourself in real time and continue. A corrected error scores better than an uncorrected one. Never stop speaking to think; keep moving through the sentence.

Task 2: Take an Interview

4 questions · 45 seconds each
Questions: 4 per interview
Response time: 45 seconds
Prep time: None
Format: Simulated video interview
Target speed: ~150 wpm

What it looks like: You see a short looping video of an interviewer who introduces the topic. You are told you have volunteered for a research study on a familiar everyday topic — daily routines, entertainment, city living, travel, technology, education, or similar accessible subjects. The interviewer then asks you four questions that progressively increase in complexity and abstractness. You hear each question once and must begin speaking immediately after the beep. You have 45 seconds to respond to each question.

The four questions within each interview follow a consistent progression. The first question is typically concrete and personal — what do you currently do, or what is your current situation regarding this topic. The second asks for your preferences or habits with a reason. The third asks for your evaluation or opinion — advantages, disadvantages, comparisons. The fourth is typically the most abstract — how might this change in the future, or what are the broader implications.

Official ETS scoring rubric: Take an Interview

The Take an Interview task is scored using four constructs. Each response is evaluated holistically across all four, resulting in a single 0 to 5 score per response. According to ETS official scoring guidance:

Construct | What it measures | What scorers listen for
Fluency | Rate, rhythm, and continuity of speech | Approximately 150 words per minute. Natural pace without long pauses. Speech that flows without significant interruptions or hesitations. Filling close to the full 45 seconds.
Intelligibility | Clarity and comprehensibility of pronunciation | Every word clearly audible and understandable without effort. Natural intonation patterns. Stress placed correctly on key words. No guessing required to understand any word.
Language Use | Grammar accuracy and vocabulary range | Accurate grammar with appropriate variety, including both simple and complex sentence structures. Natural vocabulary that is precise and varied. Connectors and transitions used correctly. No repeated overuse of basic words or patterns.
Organization and Relevancy | Structure and direct response to the question | Clear main idea stated early. Logical development with supporting points or examples. Response directly addresses what the question asked, not a generic answer that could fit any question. Coherent from start to finish.

The 45-second structure that scores well

The most effective and consistent structure for Take an Interview responses uses three phases that correspond roughly to time segments within the 45 seconds:

Seconds 0 to 15 — State your main idea clearly. Answer the question directly in the first sentence. Do not introduce yourself, restate the question, or add filler. The scorer is listening for your answer from the very first word. A response that opens with "Well, that is a really interesting question" has already lost approximately three seconds and signaled scripted delivery.

Seconds 15 to 38 — Develop with one specific example or reason. Elaborate on your main idea with one concrete supporting point. A specific example from personal experience, a real observation, or a brief explanation of why your answer is true. Specificity is rewarded — a concrete detail ("I take the metro to university every morning, which takes about 35 minutes") scores better than a vague generalization ("I use public transport because it is convenient").

Seconds 38 to 45 — Close with a brief concluding point or contrast. Round off the response naturally. This does not need to be elaborate — a single sentence that acknowledges a limitation, adds a contrasting idea, or brings the response to a natural close is enough. What matters is that the response does not simply stop mid-thought when the timer ends.
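
As a rough illustration, the three time segments can be turned into approximate word budgets. This is only a sketch: it assumes the ~150 words-per-minute target quoted in this guide holds evenly across the whole response, and the names `PHASES` and `word_budget` are invented for the example.

```python
# Assumed speaking rate: the ~150 wpm target quoted in this guide.
WORDS_PER_MINUTE = 150

# Phase boundaries in seconds, following the three-phase structure above.
PHASES = [
    ("main idea", 0, 15),
    ("example or reason", 15, 38),
    ("closing point", 38, 45),
]


def word_budget(start_s, end_s, wpm=WORDS_PER_MINUTE):
    """Approximate number of words that fit between two timestamps."""
    return (end_s - start_s) * wpm / 60


budgets = {name: word_budget(start, end) for name, start, end in PHASES}

for name, words in budgets.items():
    print(f"{name}: about {words:.1f} words")

print(f"total: about {sum(budgets.values()):.1f} words")
```

Treat these numbers as a feel for proportion rather than a script: the middle segment carries roughly half of the response, and the opening and close are short.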

Sample interview question and model answer

The following topic and question are representative of the style and difficulty level of real TOEFL 2026 Interview tasks. The model answer demonstrates the 0 to 15 / 15 to 38 / 38 to 45 structure at approximately band 5.0 level.

Interview introduction

You have volunteered for a research study about city life and daily commuting. The researcher will ask you some questions about your experience and opinions.

Question 2 of 4

"Some people prefer to live in the city center even though it is expensive. Others prefer to live further away to have more space and pay less. Which do you prefer and why?"

Model answer — approximately band 5.0 — 43 seconds

Personally I prefer living closer to the city center, even if it costs more. The main reason is convenience — when you live centrally you can walk or take public transport to most places, which saves a significant amount of time every day. In Barcelona, where I study, my university, the library, and most of the places I go regularly are all within twenty minutes on foot or by metro. That kind of access to things actually makes me more productive and less stressed than I would be if I had a longer commute. The one downside is obviously the cost and the noise, but for me the time saving is worth it, especially during exam periods when every hour matters.

Note: Direct answer in sentence one. Specific location detail adds authenticity. One concrete reason developed with a real example. Contrasting point at the end. Natural language throughout — no templates.

Three more sample interview questions for practice

Topic: Technology and daily life — Question 3

"Some people say that spending a lot of time on social media makes people feel more lonely and isolated, not less. Do you agree or disagree with this view? Why?"

Topic: Education and learning — Question 2

"Do you think it is better to study at a university in your home country or abroad? What are the main advantages of the option you prefer?"

Topic: Work and career — Question 4

"Some experts predict that in the future, most people will work from home rather than going to an office. Do you think this would be a positive or negative change for society? Give one benefit and one drawback."

Why these topics are chosen deliberately

The Take an Interview topics are designed to be answerable by any educated adult regardless of their academic background or nationality. You do not need specialist knowledge about economics, science, or any technical field. Every topic (city life, technology, education, travel, entertainment) is something any student applying to study abroad or pursue an MBA has genuine personal experience with. The challenge is not what to say. It is saying it clearly, fluently, and in an organized way within 45 seconds without preparation time.

Strategy for Take an Interview

  • 1. Answer the question in your first sentence, always. The most common mistake on Take an Interview is opening with a statement that does not answer the question. "That is an interesting point" or "There are many views on this topic" wastes time and signals scripted delivery. Your first sentence must be your answer. Train this habit from the very first day of practice. It takes two to three weeks to become automatic but produces immediate score improvement once it does.
  • 2. Fill the 45 seconds: this is a measurable target. At 150 words per minute, a 45-second response contains approximately 110 to 115 words. Count your words in practice recordings. If you are consistently under 90 words, you are not developing your answers enough. If you are consistently over 120 words, you may be rushing. The target range is not arbitrary; it reflects the fluency rate the scoring engine is calibrated to reward.
  • 3. Use specific details rather than generalizations. "People in big cities tend to be stressed" scores lower than "I noticed when I was in London last summer that people on the tube rarely made eye contact or spoke to each other, which felt quite different from Barcelona." Specific details (places, numbers, personal observations) demonstrate the authentic language use that both the AI scoring engine and human raters are trained to reward.
  • 4. Record yourself every day and listen back critically. Most students who practice without recording cannot hear their own fluency gaps, pronunciation issues, or organizational weaknesses. Your ear compensates for what your mouth does imperfectly. Recording and listening back is genuinely uncomfortable but produces faster improvement than any other single practice habit. Focus on three things each time: did I answer the question directly, did I fill the time, and was my speech continuous without long pauses.
  • 5. Do not rehearse topic-specific content. The scoring engine is specifically designed to detect memorized content. Students who memorize ten answers on different topics and then adapt them to whatever question comes up score lower than students who respond genuinely in the moment. The authenticity of spontaneous speech is a scorable signal. Prepare the structure and the habits, not the content.
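
The word-count check from the strategy list is easy to automate once you have a transcript of a practice recording. The sketch below is one possible way to do it, not an official scoring method: the function name `pace_report` is invented for this example, the 90 and 120 word thresholds are the ones suggested above, and the transcript is assumed to be typed or produced by whatever transcription tool you already use.

```python
def pace_report(transcript, seconds=45.0, low=90, high=120):
    """Count words in a transcript and compare them to the practice range.

    The low/high thresholds are the 90 and 120 word limits suggested in
    this guide; they are practice targets, not official ETS cut-offs.
    """
    words = len(transcript.split())
    words_per_minute = words * 60 / seconds

    if words < low:
        verdict = "underdeveloped"
    elif words > high:
        verdict = "possibly rushed"
    else:
        verdict = "in range"

    return words, words_per_minute, verdict


# Example with a placeholder transcript of 100 words.
sample = " ".join(["word"] * 100)
count, wpm, verdict = pace_report(sample)
print(f"{count} words, {wpm:.0f} wpm, {verdict}")
```

Whitespace splitting counts contractions and hyphenated words as single words, which is close enough for tracking your own trend from one recording to the next.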

How the Speaking section is scored

The Speaking section uses a three-layer scoring system. Each of the 11 items is scored 0 to 5. The 7 Listen and Repeat scores are averaged into a Listen and Repeat task score. The 4 Interview scores are averaged into an Interview task score. The final Speaking band is the average of the two task scores, converted to the 1.0 to 6.0 scale in 0.5 increments.
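
The averaging steps can be sketched in a few lines. This example covers only the arithmetic described above (item scores, two task averages, one combined average). It deliberately stops before the conversion to the 1.0 to 6.0 band scale, because this guide does not give that conversion table. The function name `raw_speaking_average` and the sample scores are invented for illustration.

```python
from statistics import mean


def raw_speaking_average(listen_repeat, interview):
    """Average 7 Listen and Repeat scores and 4 Interview scores (0 to 5 each).

    Returns (listen_repeat_avg, interview_avg, combined_avg). The combined
    value is still on the 0 to 5 rubric scale; converting it to a band
    score is not attempted here.
    """
    if len(listen_repeat) != 7 or len(interview) != 4:
        raise ValueError("expected 7 Listen and Repeat and 4 Interview scores")
    if any(not 0 <= s <= 5 for s in [*listen_repeat, *interview]):
        raise ValueError("each item score must be between 0 and 5")

    lr_avg = mean(listen_repeat)
    iv_avg = mean(interview)
    return lr_avg, iv_avg, mean([lr_avg, iv_avg])


# Hypothetical scores: two small slips on Listen and Repeat, one strong interview answer.
lr, iv, combined = raw_speaking_average([5, 5, 4, 5, 4, 5, 5], [4, 4, 5, 4])
print(f"Listen and Repeat: {lr:.2f}, Interview: {iv:.2f}, combined: {combined:.2f}")
```

Because the two task averages carry equal weight, each of the four Interview responses moves the combined figure more than each of the seven Listen and Repeat sentences does.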

Speaking band score guide: what each level looks like

6.0: Near-perfect. Exact repetition on L&R. Interview responses fluent, precise, naturally organized. C2 level.
5.0: Strong. Mostly exact L&R. Interview responses clear, well-developed, fills 45 seconds. Minor errors only. C1 level.
4.5: Solid. Some L&R deviations. Interview responses organized but occasionally incomplete or imprecise. B2/C1 level.
4.0: Adequate. Noticeable L&R errors. Interview responses understandable but limited development or vocabulary range. B2 level.
3.5: Limited. Frequent L&R errors. Interview responses short or disorganized with pronunciation affecting intelligibility. B1/B2 level.

Source: ets.org/toefl/test-takers/ibt/scores.html


How to prepare for TOEFL 2026 Speaking

The daily practice routine that works

Speaking improvement requires daily practice — not weekly, not three times a week, but genuinely daily. The skills that TOEFL 2026 Speaking tests are motor skills as much as language skills. Fluency, pronunciation, and the ability to organize speech under time pressure are habits that degrade quickly without consistent use and improve steadily with consistent practice.

A 20-minute daily Speaking practice session produces faster improvement than longer, less frequent sessions. Structure it as follows: five minutes of shadowing for Listen and Repeat preparation, ten minutes of timed Interview practice (recording three 45-second responses on different topics), and five minutes of listening back to identify specific areas to improve. That is it: twenty minutes. Done every day for eight to twelve weeks, this routine produces a measurable band improvement for the majority of students who commit to it.

Use the free Speaking practice tool

Our free Speaking practice tool at toefl.prepdrills.com/speaking-practice provides AI-generated audio prompts for both Listen and Repeat and Take an Interview, a recording interface, and the official ETS scoring rubric alongside a model answer for each prompt. It is the closest public approximation of the real test experience available at no cost, and it is specifically built for the 2026 format.

Take the free diagnostic to find your current Speaking band

Before you build any preparation plan, find out where your Speaking band currently sits. Our free TOEFL 2026 diagnostic at toefl.prepdrills.com covers all four sections including Speaking and gives you an honest starting point within 25 to 30 minutes.

When to work with a teacher

Speaking is the section where teacher feedback adds the most value and is the hardest to replicate through self-study alone. The reason is simple: you cannot objectively hear your own speaking. Your brain fills in gaps, compensates for pronunciation errors, and interprets your own delivery more charitably than the scoring engine does.

A certified TOEFL teacher who knows the 2026 format can identify in a single session the specific patterns that are costing you points — whether it is a consistent hesitation before certain sounds, a tendency to underload your responses with content, a grammatical structure you overuse, or a pronunciation habit that affects intelligibility in ways you cannot hear yourself. For students targeting band 5.0 or above — the threshold for most exchange programs, MBA applications, and top university admissions in Spain, Italy, France, and beyond — at least a few sessions with an expert teacher dramatically accelerates progress.

Epic Exam Prep is based in Barcelona and has been preparing students for TOEFL Speaking since 2010, with one-to-one sessions and monthly group courses for students across Europe and beyond. Their teachers specialize in the 2026 format and provide the kind of targeted feedback that a practice tool cannot replicate.

What 15 years of preparing students for TOEFL Speaking taught us

The gap between a student's spoken English in conversation and their TOEFL Speaking score is almost always bigger than it should be. Students who communicate confidently and naturally in English consistently underperform on TOEFL Speaking because they have not adapted to the specific demands of the task: spontaneous organized speech under time pressure, recorded into a microphone, scored by an AI engine that weights fluency and organization as heavily as grammar. A student who sounds excellent in conversation but scores 4.0 on Speaking almost always has a fixable habit: they are pausing too long to think, underloading their responses, or opening without a direct answer. These are not language problems. They are task problems, and they respond quickly to the right kind of targeted practice.
