What is the TOEFL 2026 Speaking section?

The TOEFL 2026 Speaking section has 11 scored items across two task types: Listen and Repeat (7 sentences) and Take an Interview (4 questions). The section takes approximately 8 to 10 minutes and has no preparation time for either task. Every response is scored by the ETS AI scoring engine: Listen and Repeat on Repeat Accuracy, Fluency, and Intelligibility, and Take an Interview on Fluency, Intelligibility, Language Use, and Organization. The final Speaking band is reported on the 1.0 to 6.0 scale aligned with CEFR levels.

How does the TOEFL 2026 Take an Interview task work?

In Take an Interview, you participate in a simulated online interview with a researcher. You answer four questions on one everyday topic such as city living, daily routines, or technology habits. Each question gives you 45 seconds to respond. There is no preparation time and no note-taking allowed. Questions progress from simple factual questions about your experience to broader opinion and evaluation questions. You hear each question once and must respond immediately.

How is TOEFL 2026 Speaking scored?

Each of the 11 Speaking responses is scored from 0 to 5 using the ETS scoring rubric. For Listen and Repeat, the rubric prioritizes Repeat Accuracy, Fluency, and Intelligibility. For Take an Interview, it evaluates Fluency, Intelligibility, Language Use, and Organization including Relevancy. The 7 Listen and Repeat scores are averaged into a task score and the 4 Interview scores are averaged into a task score. The final Speaking band (1.0 to 6.0) is the average of these two task scores, rounded to the nearest 0.5.

Can I use templates for TOEFL 2026 Speaking?

Memorized templates are not effective on the TOEFL 2026 Speaking section. The ETS AI scoring engine is designed to detect generic, rehearsed delivery and robotic phrasing. Both the AI system and human raters penalize responses that sound scripted. A simple flexible structure works far better: state your answer directly, give one or two reasons with a brief example, and wrap up in 40 to 45 seconds. The goal is natural, organized speech, not a memorized script.

What is Listen and Repeat in TOEFL 2026?

Listen and Repeat is the first task in the TOEFL 2026 Speaking section. You hear seven sentences about a campus or everyday situation, one at a time, and repeat each one after a beep. Sentences get progressively longer and more complex. According to the ETS scoring rubric, a perfect score of 5 requires the response to be fully intelligible and an exact repetition of the prompt. Self-correction is allowed. The key skills are auditory memory, pronunciation clarity, and rhythm — not memorized vocabulary.

What TOEFL 2026 Speaking band do I need for university?

Most competitive universities in the US and UK require a TOEFL 2026 Speaking band of 4.0 or above. Programs with teaching assistant requirements or strong oral communication components often require 4.5 or higher. A Speaking band of 5.0 is roughly equivalent to 26 to 28 on the old 0 to 30 section scale. Always verify the specific section minimum for your target program on the institution's official admissions page.

How do I prepare for TOEFL 2026 Speaking at home?

The most effective home preparation for TOEFL 2026 Speaking combines daily speaking practice with AI-powered feedback. Record yourself answering opinion questions for 45 seconds daily, focusing on maintaining a steady pace of 140 to 160 words per minute. Practice shadowing for Listen and Repeat — listen to short audio clips and repeat immediately after, matching rhythm and stress. Use a TOEFL speaking practice app that provides AI feedback on fluency, intelligibility, and language use. A good teacher who can give you real-time corrections on pronunciation and grammar will accelerate your progress significantly.

TOEFL 2026 Speaking: How to Ace Take an Interview (Complete Guide)

What changed: on January 21, 2026, ETS replaced the entire TOEFL Speaking section. The old four-task format — with its integrated passages, reading texts, and 15 to 30 seconds of preparation time — is gone. In its place are two completely new task types: Listen and Repeat, and Take an Interview. No templates. No prep time. No integrated content. Just you, speaking spontaneously, scored by AI in real time. This guide covers everything you need to know to score at band 5.0 and above.

The new format: what you are actually facing

The TOEFL 2026 Speaking section is the shortest and most practically focused section in the new exam. Total time: approximately 8 to 10 minutes. It comes last in the test, after Reading, Listening, and Writing. There are 11 scored items across two task types, and there is zero preparation time for either of them.

Task 1: Listen and Repeat

7 sentences, one at a time
8 to 12 seconds to respond after beep
Sentences get progressively longer
Topic: campus or everyday situation
Scored on: Accuracy, Fluency, Intelligibility
Goal: exact repetition, not paraphrasing

Task 2: Take an Interview

4 questions on one everyday topic
45 seconds per response
No preparation time at all
Questions progress from simple to opinion-based
Scored on: Fluency, Intelligibility, Language Use, Organization
Goal: natural, organized, spontaneous speech

According to the official ETS scoring framework, your final Speaking band is the average of two task scores — one for Listen and Repeat (averaging your 7 item scores) and one for Take an Interview (averaging your 4 item scores) — rounded to the nearest 0.5 on the 1.0 to 6.0 band scale. Neither task is weighted more heavily than the other in the final calculation, which means a weak performance on Listen and Repeat costs you just as much as a weak performance on the Interview.
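The calculation described above can be sketched in a few lines of Python. This is only an illustration of the averaging and rounding as this guide describes them, not ETS's actual implementation. The function names are hypothetical, ties are assumed to round up, and the conversion from the 0 to 5 item scale to the 1.0 to 6.0 band scale is not covered here, so the result stays on the item scale.

```python
import math


def round_to_nearest_half(value: float) -> float:
    """Round to the nearest 0.5. Ties round up, which is an assumption."""
    return math.floor(value * 2 + 0.5) / 2


def combined_speaking_average(listen_repeat: list[int], interview: list[int]) -> float:
    """Average each task's item scores, then average the two task scores.

    Both tasks carry equal weight, as described in the text. The result is
    on the 0 to 5 item scale; mapping it to a band is outside this sketch.
    """
    if len(listen_repeat) != 7 or len(interview) != 4:
        raise ValueError("expected 7 Listen and Repeat scores and 4 Interview scores")
    for score in listen_repeat + interview:
        if not 0 <= score <= 5:
            raise ValueError("item scores must be between 0 and 5")
    listen_repeat_task = sum(listen_repeat) / len(listen_repeat)
    interview_task = sum(interview) / len(interview)
    return round_to_nearest_half((listen_repeat_task + interview_task) / 2)


# A strong Interview cannot fully offset a weak Listen and Repeat.
neglected_repeat = combined_speaking_average([3, 3, 3, 2, 2, 2, 2], [5, 5, 4, 4])
balanced = combined_speaking_average([4, 4, 4, 4, 4, 4, 3], [4, 4, 4, 4])
print(neglected_repeat, balanced)
```

In this made-up example the test taker with Interview scores of 4 and 5 still ends up below the test taker who scored steadily on both tasks, because each task contributes half of the result.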

Why this matters for your preparation: Most students instinctively focus all their preparation time on the Interview because it feels more familiar; answering questions is something they have done before. But Listen and Repeat counts for exactly the same amount. Students who neglect it often plateau at band 4.0 even when their conversational English is strong. Balance your preparation between both tasks from day one.

Listen and Repeat: what it actually tests

Listen and Repeat is simpler in concept than the Interview, but it catches more students off guard. The task is straightforward: hear a sentence, wait for the beep, repeat it exactly. No paraphrasing, no summarising, no expressing your own ideas. The goal is precise auditory reproduction.

According to the official ETS rubric, a perfect score of 5 requires the response to be fully intelligible and an exact repetition of the prompt. A single meaningful error drops you to a 4. Missing content or changed meaning takes you to 2 or 3. The rubric rewards precision over creativity.

The seven sentences follow a progression in difficulty. The first two are short and simple — approximately 8 to 10 words. The middle three are medium length with more content words and clauses. The final two are the longest and most complex, often containing subordinate clauses, technical vocabulary, or sequences of steps. The long sentences are genuinely difficult, even for students with strong English. This is not a section to underestimate.

What actually trips students up

The most common error is not pronunciation — it is auditory memory. Students hear a sentence, process the meaning, and then speak from their understanding rather than from the precise words they heard. This works well in normal conversation but loses you points here, because the rubric scores repetition accuracy, not comprehension.

The fix is dedicated shadowing practice: listening to short audio clips and repeating immediately, word for word, matching the speaker's rhythm, stress, and intonation. Daily shadowing of 10 to 15 minutes builds the auditory memory muscle this task requires far more effectively than vocabulary study or pronunciation drills alone.
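If you type out what you actually said from a recording, a short script can show where your repetition drifted from the original. The sketch below is a self-check aid only and has nothing to do with how ETS scores the task. The function names are hypothetical, and it assumes you have both the prompt and your attempt as text.

```python
import difflib
import re


def normalize(sentence: str) -> list[str]:
    """Lowercase the sentence and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def repetition_report(prompt: str, attempt: str) -> dict:
    """Compare an attempt with the prompt word by word.

    'missing' lists prompt words you dropped or replaced.
    'extra' lists words you said that were not in the prompt at that point.
    """
    target, spoken = normalize(prompt), normalize(attempt)
    matcher = difflib.SequenceMatcher(a=target, b=spoken, autojunk=False)
    missing, extra = [], []
    for tag, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            missing.extend(target[a_start:a_end])
        if tag in ("insert", "replace"):
            extra.extend(spoken[b_start:b_end])
    return {"exact": target == spoken, "missing": missing, "extra": extra}


prompt = "The library closes at nine, so please return your books before then."
attempt = "The library closes at nine, so please bring your books back before then."
print(repetition_report(prompt, attempt))
```

The sample attempt keeps the meaning but swaps "return" for "bring ... back", which is exactly the kind of paraphrase from understanding that this task does not reward.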

Self-correction is permitted. If you realise mid-sentence that you got a word wrong, go back and fix it. ETS allows this without penalty. If you completely lose a word, make your best guess and keep moving — stopping entirely and going silent costs more than an imperfect attempt.

Take an Interview: understanding the task

The Interview simulates a short online conversation with a researcher. You see the interviewer on screen in a short looping video, which creates a more natural conversational feel than reading a prompt from a page. The four questions all relate to one everyday topic — common themes include city life and daily routines, commuting and transportation, technology habits, personal preferences about learning or working, and opinions on social or community issues.

The questions follow a pattern that moves from concrete to abstract:

Question 1 typically asks about your current situation or personal experience. ("Do you currently live in a city, a small town, or a village?") This is the easiest question and your opportunity to settle in.

Question 2 asks about your habits or preferences related to the topic. ("How often do you use public transport?") Still factual, but starting to require more development.

Question 3 asks for your opinion or evaluation. ("Do you think cities are becoming easier or harder to live in?") This is where language complexity matters most.

Question 4 asks you to consider a broader perspective or hypothetical. ("What do you think governments should do to improve city life for young people?") The hardest question, requiring organized reasoning under time pressure with no preparation.

You hear each question once and must begin speaking immediately. There is no preparation time and no note-taking. You have 45 seconds per response.

How your Interview responses are scored

Every Interview response is scored on the 0 to 5 scale using four dimensions. Understanding these dimensions is the most important preparation step, because they tell you exactly what the ETS AI scoring engine is evaluating every time you speak.

Score	What it looks like
5	Fluent and clear throughout. Directly answers the question with a well-organized response. Varied vocabulary and grammar. Maintains natural pace of 140 to 160 words per minute. Fills close to the full 45 seconds.
4	Clear and relevant. Minor errors in grammar or vocabulary that do not obscure meaning. May lack connectors or full development. Generally good pace with small disruptions.
3	Understandable but choppy. Frequent pauses or reduced pace. Limited development of ideas. Some grammar errors that affect clarity. Relevancy is present but response may feel thin.
2	Significant fluency problems. Meaning is sometimes unclear. Limited range of vocabulary and grammar. Response does not fully address the question.
1	Largely unintelligible or very brief. Major problems with pronunciation, vocabulary, or grammar throughout. Response barely engages with the prompt.
0	No response or entirely off topic.

The four scoring constructs that generate this score are Fluency (steady pace, minimal unnatural pauses, smooth delivery), Intelligibility (clear pronunciation, word stress, and rhythm), Language Use (accurate and varied grammar and vocabulary), and Organization including Relevancy (directly answering the question with a clear, logical structure).

The biggest misconception about scoring: Many students believe that using advanced vocabulary and complex grammar structures will earn a higher score. This is only partly true. The ETS AI scoring engine is specifically trained to penalize responses where advanced vocabulary is used incorrectly or where complex structures create clarity problems. A response using common words correctly, in a clear organized structure, at a natural pace, outscores a response full of impressive words used inaccurately. Clarity beats complexity every time.

The strategy that actually works

Because there is no preparation time, everything you do in the 45 seconds depends on what is already automatic in your spoken English. That is the core challenge of the TOEFL 2026 Speaking section — and the reason most students who try to prepare by memorizing templates fail to improve past a certain point.

Use a simple, flexible structure — not a memorized template

The most reliable structure for all four Interview questions: state your answer directly in the first 5 to 7 seconds, give one or two reasons with a brief example in the next 25 to 30 seconds, and wrap up in the final 5 seconds by restating your main point. This gives the AI scoring engine clear Organization and Relevancy without sounding scripted. Adapt the language to each question rather than filling in blanks from a memorized script.
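To see how that structure fits inside the 45-second window, you can add up the segments. The sketch below uses only the timings given above; the segment labels and variable names are illustrative.

```python
RESPONSE_SECONDS = 45

# (segment, shortest seconds, longest seconds), taken from the structure above
SEGMENTS = [
    ("direct answer", 5, 7),
    ("reasons and example", 25, 30),
    ("wrap-up", 5, 5),
]

shortest = sum(low for _, low, _ in SEGMENTS)
longest = sum(high for _, _, high in SEGMENTS)

print(f"planned speech: {shortest} to {longest} seconds")
print(f"slack: {RESPONSE_SECONDS - longest} to {RESPONSE_SECONDS - shortest} seconds")
```

The plan covers 35 to 42 seconds, which leaves 3 to 10 seconds of slack for a natural pause or a slightly longer example.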

Target 140 to 160 words per minute

This is the speaking pace associated with natural, fluent English. Too slow sounds hesitant and scripted. Too fast creates intelligibility problems. Record yourself answering opinion questions and count your words. If you are consistently under 120 words per minute, pace is your primary target. If you are over 175, slow down. At 45 seconds per response, a band 5.0 answer is typically 100 to 120 words.
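The pace check is simple arithmetic, so it is easy to script once you have counted the words in a recording. This is a minimal sketch using the thresholds from the paragraph above. The function names are hypothetical, and the "close to target" label for paces between the stated cut-offs is an assumption, since the text does not name that zone.

```python
def words_per_minute(word_count: int, seconds: float) -> float:
    """Convert a word count and a duration in seconds to words per minute."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return word_count * 60 / seconds


def pace_feedback(wpm: float) -> str:
    """Classify a pace using the thresholds quoted in this guide."""
    if wpm < 120:
        return "too slow: make pace your primary target"
    if wpm > 175:
        return "too fast: slow down"
    if 140 <= wpm <= 160:
        return "on target"
    return "close to target"  # assumed label for 120-139 and 161-175 wpm


def target_word_range(seconds: float) -> tuple[int, int]:
    """Word counts that correspond to 140 and 160 wpm for a given duration."""
    return round(140 * seconds / 60), round(160 * seconds / 60)


pace = words_per_minute(110, 45)
print(round(pace, 1), pace_feedback(pace))
print(target_word_range(45))
```

For a full 45-second response, the 140 to 160 words per minute target works out to 105 to 120 words, which is in line with the typical range quoted above.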

Answer the question that was asked, not the question you prepared for

Relevancy is a scored dimension. The AI scoring engine compares your response to the specific question asked and penalizes answers that drift off topic or answer a different question. When you hear each question, take half a second to identify exactly what is being asked before you start speaking. Question 4 in particular often surprises students because it shifts from personal experience to broader societal opinion — make sure you track that shift.

Keep moving — never stop mid-response

Silence is the most costly error in speaking. A long pause signals fluency problems to the scoring engine far more than a grammatical error does. If you lose your train of thought, use a cue phrase to buy yourself a moment: "Let me think about that for a second" or "Another way to look at it is..." These keep your speech moving while you organize your next idea. Imperfect but continuous speech scores better than perfect speech with silences.

Use transitions to demonstrate Organization

The scoring engine evaluates Organization partly through the presence of logical connectors. Phrases like "the main reason is," "for example," "additionally," and "so overall" signal structure to both the AI and human raters. You do not need elaborate academic transitions; simple, correctly used connectors consistently score better than sophisticated language used awkwardly. Build a small set of go-to transition phrases and use them naturally.

Fill the full 45 seconds

Very short responses almost never score above 3. The scoring rubric explicitly requires sufficient development of ideas, and a response that ends at 20 seconds does not give the engine enough data to score Language Use or Organization reliably. If you have finished your main point with time remaining, extend your example, add a second reason, or connect your answer to a broader implication. Practice finishing close to 45 seconds consistently.

Sample question and response

Here is an example of what a Question 3 level Interview question looks like, and what a band 4.5 to 5.0 response sounds like in practice:

Sample Interview Question 3

"Do you think living in a big city makes it easier or harder to meet new people? Why?"

Personally, I think it makes it easier, even though it might not feel that way at first. The main reason is that cities offer so many different kinds of places where people naturally come together — classes, sports clubs, community events, coffee shops. You are constantly around people with different backgrounds, which makes conversations happen more organically. For example, in my experience, I have met people through a language exchange group that I never would have found in a smaller town. Of course, cities can feel anonymous too, but I think that is more about how you choose to use the space than about the city itself. So overall, I would say big cities actually give you more opportunities to connect, if you make the effort.

About 43 seconds | about 115 words | clear structure | natural pace

Notice what this response does: it answers the question directly in the first sentence, gives a reason with a concrete mechanism, adds a personal example, acknowledges the counterargument briefly, and wraps up. It uses common vocabulary correctly. It does not try to impress with difficult words. It fills close to the full 45 seconds.

The five most common mistakes

✗ Using memorized templates: The AI scoring engine detects generic, rehearsed phrasing and robotic delivery. Templates produce lower scores than natural, adapted responses. Learn a flexible structure, not a fixed script.
✗ Neglecting Listen and Repeat: Students treat this as easy and spend all their preparation time on the Interview. Listen and Repeat accounts for half the Speaking score. Shadowing practice is the most effective preparation and most students do almost none of it.
✗ Stopping when they lose their thread: Silence is more damaging to your Fluency score than any grammatical error. Keep speaking, use a filler phrase, and find your way back to the topic. The engine needs continuous speech data to score you accurately and favorably.
✗ Finishing too early: A 20-second response to a 45-second question is a structural problem, not a fluency problem. Practice extending your answers until finishing close to 45 seconds feels natural. Add a second example, a contrasting view, or a forward-looking statement to use your time.
✗ Practising reading, not speaking: Reading TOEFL speaking guides and looking at sample responses does not build speaking ability. The only preparation that works is speaking out loud, on a timer, with feedback. Record yourself. Listen back. Identify your specific patterns. Improve them.

How to prepare effectively

The TOEFL 2026 Speaking section rewards habits built over weeks, not knowledge acquired the night before. Here is how to structure your preparation.

Daily speaking practice (15 to 20 minutes)

Every day, answer five opinion questions on a timer. Use topics similar to Interview questions — city life, technology, education, work habits, social issues. Record yourself. Time your responses. Listen back and identify your specific weaknesses: Are you finishing too early? Are your transitions missing? Are you speaking too slowly? Are you drifting off topic on harder questions? The feedback loop matters as much as the practice itself.

Daily shadowing (10 to 15 minutes)

Pick any clear audio source — a news podcast, a short documentary clip, a TED Talk excerpt — and shadow it: listen and repeat immediately after the speaker, matching their rhythm, stress, and intonation word for word. This builds the auditory memory and natural rhythm that Listen and Repeat directly tests. Ten minutes of focused shadowing daily produces measurable improvement within two weeks.

AI-powered feedback

Practising alone is useful. Practising with feedback is dramatically more effective. Our TOEFL 2026 Speaking practice at toefl.prepdrills.com grades your Speaking responses using AI feedback on fluency, pronunciation, and content — so you know exactly what the scoring engine sees, not just how you feel about your own responses. It is available any time, free to start, and designed specifically for the 2026 format including both Listen and Repeat and Take an Interview tasks.

Work with a teacher for the final push

AI feedback catches the patterns your score reports reveal. A good Speaking teacher catches the patterns you cannot hear yourself — habitual pronunciation errors, logic gaps in your arguments, filler phrases you are not aware of using. For students targeting band 5.0 or above, at least a few sessions with a certified TOEFL teacher significantly accelerates the final stage of improvement. Epic Exam Prep offers one-to-one TOEFL Speaking preparation with teachers who specialize in exactly this section and know the 2026 rubrics in detail.

The honest truth about Speaking improvement: Speaking is the hardest TOEFL section to improve alone. Reading, Writing, and Listening can all be developed through structured self-study. Speaking requires you to produce language under pressure, in real time, and evaluate it accurately, which is very difficult to do without external feedback. Students who plateau at band 3.5 or 4.0 despite consistent self-study almost always unlock improvement when they get real feedback, either from AI scoring or from a teacher who can hear what they cannot. Do not spend six weeks practicing the wrong things in private. Get feedback early.

What band do you need?

Most competitive universities in the US and UK require an overall TOEFL 2026 band of 4.5, which corresponds to a Speaking section band of approximately 4.0 or above. Programs that involve teaching, presenting, or significant oral communication — including teaching assistant roles, law, medicine, and some business programs — often set Speaking minimums of 4.5 or higher.

A Speaking band of 5.0 is roughly equivalent to 26 to 28 on the old 0 to 30 section scale. A band of 4.0 corresponds approximately to 22 to 24 on the old scale. Always verify the specific Speaking minimum for your target program, as institutions are still updating their requirements for the 2026 format.

