How to Use Google Nano Banana for Image Cutouts

This article walks through a practical Python script that calls Google's Nano Banana image editing capability for product-image background removal. The full source code is preserved at the end so you can reuse it directly.

The goal of this implementation is very clear:

  • Read product images from a directory
  • Call a Google image model to remove the background
  • Apply one more round of local transparent-background cleanup to the returned image
  • Export the final result as a transparent PNG

If you already have a batch of white-background product photos, headset images, or cable images and want to quickly generate transparent-background assets for e-commerce use, this approach is very direct.

What this code does

The script is divided into four main parts:

  1. Define a prompt so the model understands it should remove the background, keep the subject intact, and avoid adding shadows
  2. Call the google-genai image generation interface
  3. Extract the image result from the model response
  4. Use local logic to turn light-colored edge background areas transparent and reduce leftover halos

In other words, it does not simply send the image to the model and stop there. It combines model editing with local post-processing.

Before you run it

Install the dependencies first:

.\.venv\Scripts\python.exe -m pip install google-genai pillow

How to get GEMINI_API_KEY

GEMINI_API_KEY is the key used when calling the Gemini API. According to Google’s official quickstart, if you do not already have one, you can create it directly in Google AI Studio.

The process is straightforward:

  1. Open Google AI Studio.
  2. Sign in with your Google account.
  3. Find the Get API key or API keys page.
  4. Create a new API key.
  5. Copy the generated key.
  6. Configure it as a local environment variable for the script to read.

If there is no available project on the page yet, you typically need to finish project initialization first and then return to the API key page to create the key.

After you have the key, configure the environment variable:

$env:GEMINI_API_KEY="your_api_key"

If you are using cmd, you can write:

set GEMINI_API_KEY=your_api_key

If both GEMINI_API_KEY and GOOGLE_API_KEY are set, the runtime will usually prefer GOOGLE_API_KEY, so it is better to keep only one of them to avoid confusion.
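The precedence described above can be sketched with a small helper (`resolve_api_key` is an illustrative name, not part of the script):

```python
import os

def resolve_api_key() -> "str | None":
    # Mirrors the precedence described above: GOOGLE_API_KEY wins when
    # both variables are set, so keeping only one avoids surprises.
    return os.environ.get("GOOGLE_API_KEY") or os.environ.get("GEMINI_API_KEY")
```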

Example directory structure

The script accepts two arguments:

  • input_dir: input image directory
  • output_dir: output image directory

For example:

images/
  product1.jpg
  product2.png

output/

How to run it

Assuming the script file is named cutout.py, run it like this:

.\.venv\Scripts\python.exe .\cutout.py .\images .\output

If you want to switch models, you can also pass it explicitly:

.\.venv\Scripts\python.exe .\cutout.py .\images .\output --model gemini-2.5-flash-image

The script will iterate over these file types in the input directory:

  • .jpg
  • .jpeg
  • .png
  • .webp

After processing, it will generate transparent-background PNG files with matching names in the output directory.
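The scan-and-name step can be sketched on its own with pathlib (`list_targets` is an illustrative helper, not part of the script):

```python
from pathlib import Path

EXTS = {".jpg", ".jpeg", ".png", ".webp"}

def list_targets(input_dir: Path, output_dir: Path) -> list:
    # Pair each supported input image with its output path: same stem,
    # always a .png suffix so the result can hold an alpha channel.
    pairs = []
    for src in sorted(input_dir.iterdir()):
        if src.is_file() and src.suffix.lower() in EXTS:
            pairs.append((src, output_dir / f"{src.stem}.png"))
    return pairs
```

Note that the suffix check is case-insensitive, so `photo.PNG` is picked up as well.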

Core API call flow

The key code that actually calls Google Nano Banana is here:

response = client.models.generate_content(
    model=model,
    contents=[PROMPT, image],
)

Two pieces of content are passed in here:

  • A text prompt, PROMPT
  • A PIL.Image

The prompt asks the model to remove the full background from the product image, keep only the subject, and pay attention to a few important requirements:

  • Keep the full product intact
  • Preserve thin lines and cable details
  • Clean up inner holes and loop areas
  • Do not add new objects
  • Do not add shadows

Prompts like this have a big effect on cutout quality, especially for details such as earphone wires, transparent edges, and hollow regions.

Why local post-processing is still needed

After the model returns the result, the script does not save it directly. It also runs make_transparent_from_borders(image).

The idea behind this step is:

  • Start from the outer border of the image and find light-colored background pixels
  • Use breadth-first search to mark all connected light-colored regions
  • Convert those regions to transparent in one pass

The benefit is that it can further clean up leftover white edges, light gray backgrounds, and edge areas that are not clean enough.
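The same border flood-fill can be demonstrated on a plain 2-D grid, independent of PIL (`flood_background` is an illustrative name; `1` stands in for a light pixel):

```python
from collections import deque

def flood_background(grid: list) -> set:
    # Collect every "light" cell (value 1) reachable from the border via
    # 4-connected neighbors -- the same idea as the script's
    # make_transparent_from_borders, minus the RGBA handling.
    height, width = len(grid), len(grid[0])
    visited = set()
    queue = deque()

    def push_if_bg(x: int, y: int) -> None:
        if (x, y) not in visited and grid[y][x] == 1:
            visited.add((x, y))
            queue.append((x, y))

    for x in range(width):
        push_if_bg(x, 0)
        push_if_bg(x, height - 1)
    for y in range(height):
        push_if_bg(0, y)
        push_if_bg(width - 1, y)

    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if 0 <= nx < width and 0 <= ny < height:
                push_if_bg(nx, ny)
    return visited
```

A light cell enclosed by the subject (a specular highlight, for example) is never reached from the border, so it survives the pass. That is exactly why the search starts at the edges instead of scanning every pixel.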

Whether a pixel counts as background is decided by this condition:

def is_light_background_pixel(r: int, g: int, b: int) -> bool:
    brightness = (r + g + b) / 3
    spread = max(r, g, b) - min(r, g, b)
    return brightness >= 170 and spread <= 35

In simple terms, this means:

  • The overall color must be bright enough
  • The RGB channel difference cannot be too large

This is especially suitable for product images with white, light gray, or near-solid-color backgrounds.
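Plugging a few representative colors in shows how the two thresholds interact (the function is repeated here so the snippet runs standalone):

```python
def is_light_background_pixel(r: int, g: int, b: int) -> bool:
    brightness = (r + g + b) / 3
    spread = max(r, g, b) - min(r, g, b)
    return brightness >= 170 and spread <= 35

# Pure white and light gray pass both checks.
print(is_light_background_pixel(255, 255, 255))  # True
print(is_light_background_pixel(210, 210, 205))  # True
# A bright but saturated color fails the spread check (spread = 135) ...
print(is_light_background_pixel(255, 200, 120))  # False
# ... and a dark neutral fails the brightness check (brightness = 100).
print(is_light_background_pixel(100, 100, 100))  # False
```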

Full source code

The complete source code is preserved below so you can reuse or modify it directly:

from __future__ import annotations

import argparse
import os
from pathlib import Path
from collections import deque

from PIL import Image

try:
    from google import genai
except ImportError as exc:  # pragma: no cover
    raise SystemExit(
        "Missing dependency: google-genai. Install it with "
        r"'.\.venv\Scripts\python.exe -m pip install google-genai'."
    ) from exc


PROMPT = (
    "Remove the entire background from this product photo and return only the product "
    "on a fully transparent background as a PNG. Keep the full product intact, preserve "
    "thin cable details, clean the inner loops and holes, and do not add any new objects "
    "or shadows."
)


def is_light_background_pixel(r: int, g: int, b: int) -> bool:
    brightness = (r + g + b) / 3
    spread = max(r, g, b) - min(r, g, b)
    return brightness >= 170 and spread <= 35


def to_pil_image(image_obj) -> Image.Image:
    if isinstance(image_obj, Image.Image):
        return image_obj
    pil_image = getattr(image_obj, "_pil_image", None)
    if isinstance(pil_image, Image.Image):
        return pil_image
    as_pil = getattr(image_obj, "pil_image", None)
    if isinstance(as_pil, Image.Image):
        return as_pil
    raise TypeError(f"Unsupported image object type: {type(image_obj)!r}")


def make_transparent_from_borders(image: Image.Image) -> Image.Image:
    # Flood-fill inward from the image border: every light pixel connected
    # to the border is treated as background and made fully transparent.
    rgba = image.convert("RGBA")
    width, height = rgba.size
    pixels = rgba.load()

    visited: set[tuple[int, int]] = set()
    queue: deque[tuple[int, int]] = deque()

    def push_if_bg(x: int, y: int) -> None:
        if (x, y) in visited:
            return
        r, g, b, _ = pixels[x, y]
        if is_light_background_pixel(r, g, b):
            visited.add((x, y))
            queue.append((x, y))

    for x in range(width):
        push_if_bg(x, 0)
        push_if_bg(x, height - 1)
    for y in range(height):
        push_if_bg(0, y)
        push_if_bg(width - 1, y)

    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if 0 <= nx < width and 0 <= ny < height:
                push_if_bg(nx, ny)

    for x, y in visited:
        pixels[x, y] = (0, 0, 0, 0)

    return rgba


def save_first_image_part(response, dst: Path) -> None:
    parts = getattr(response, "parts", None)
    if parts is None and getattr(response, "candidates", None):
        parts = response.candidates[0].content.parts

    if not parts:
        raise RuntimeError("Model returned no content parts.")

    for part in parts:
        inline_data = getattr(part, "inline_data", None)
        if inline_data is None and isinstance(part, dict):
            inline_data = part.get("inline_data")

        if inline_data is None:
            continue

        if hasattr(part, "as_image"):
            image = to_pil_image(part.as_image())
            dst.parent.mkdir(parents=True, exist_ok=True)
            make_transparent_from_borders(image).save(dst)
            return

        data = getattr(inline_data, "data", None)
        mime_type = getattr(inline_data, "mime_type", "")
        if data:
            dst.parent.mkdir(parents=True, exist_ok=True)
            with open(dst, "wb") as handle:
                handle.write(data)
            with Image.open(dst) as img:
                processed = make_transparent_from_borders(img)
                processed.save(dst.with_suffix(".png"))
            if dst.suffix.lower() != ".png":
                dst.unlink(missing_ok=True)
            return

    raise RuntimeError("Model returned text only and no edited image.")


def process_image(src: Path, dst: Path, client, model: str) -> None:
    # Open the source file and close its handle once the request is built;
    # convert() produces an in-memory RGBA copy for the API call.
    with Image.open(src) as image:
        response = client.models.generate_content(
            model=model,
            contents=[PROMPT, image.convert("RGBA")],
        )
    save_first_image_part(response, dst)


def main() -> None:
    parser = argparse.ArgumentParser(description="Use Nano Banana / Gemini image editing to cut out product images.")
    parser.add_argument("input_dir", type=Path)
    parser.add_argument("output_dir", type=Path)
    parser.add_argument("--model", default="gemini-2.5-flash-image")
    args = parser.parse_args()

    api_key = os.environ.get("GEMINI_API_KEY")
    if not api_key:
        raise SystemExit("Missing GEMINI_API_KEY environment variable.")

    client = genai.Client(api_key=api_key)
    exts = {".jpg", ".jpeg", ".png", ".webp"}

    for src in sorted(args.input_dir.iterdir()):
        if not src.is_file() or src.suffix.lower() not in exts:
            continue
        dst = args.output_dir / f"{src.stem}.png"
        process_image(src, dst, client, args.model)
        print(dst)


if __name__ == "__main__":
    main()

Good places to improve it further

If you plan to use this script for batch production, you can continue improving it in these directions:

  • Add retry logic so one failed image does not stop the whole batch
  • Add logs to make it easier to identify which image failed
  • Make background thresholds configurable
  • Support recursive scanning of subdirectories
  • Add a side-by-side preview of the original and processed result

Summary

If you only want the shortest explanation of how to use Google Nano Banana for cutouts, the core process is just three steps:

  1. Install google-genai and Pillow
  2. Set GEMINI_API_KEY
  3. Pass the prompt and image to client.models.generate_content()

The value of this code is that it does more than just call the model. It also adds transparent-background post-processing, which makes it more suitable for direct product-image cutout work.
