WP Fix by Blimx

Japanese SEO Spam — Detection and Removal in WordPress

What the Japanese keyword hack does

The Japanese SEO spam hack (also called the Japanese keyword hack) is a WordPress malware variant that creates thousands of low-quality, auto-generated pages on your site, each targeting a Japanese keyword to earn affiliate revenue or sell counterfeit goods. The pages are cloaked: ordinary visitors and logged-in admins never see them, but Google crawls and indexes them.

By the time you notice, Google Search Console shows you have 50,000+ pages indexed, your site appears in Japanese search results for terms like "ロレックス スーパーコピー" (replica Rolex), and your real pages drop in ranking because Google now sees your site as a spam farm.

This article is the complete detection and removal playbook.

How the hack operates

The attack pattern is consistent across infections:

1. Initial compromise via plugin vulnerability

Almost every Japanese hack we've cleaned in 2025-2026 started with a known plugin CVE — file manager plugins, contact forms, page builders. The attacker exploits the CVE to upload a webshell.

2. Webshell deploys persistence

The webshell creates a backdoor user, adds a malicious mu-plugin, and modifies .htaccess to enable URL rewriting.

3. Page generation begins

A scheduled task (system cron or WP-Cron) or a callback hooked to WordPress's init action generates thousands of fake posts/pages. Each page targets a Japanese keyword and contains affiliate links.

4. Sitemap pollution

The malware adds these fake URLs to a hidden sitemap (sitemap-jp.xml, sitemap-extra.xml) or modifies your existing Yoast/RankMath sitemap to include them.

5. Search engine indexing

Google discovers the new URLs via the sitemap and indexes them. Within weeks, your site shows 10,000-100,000+ pages of Japanese spam in Google Search Console.

Detection

The Japanese hack hides from your browser but reveals itself to Google. Three reliable detection methods:

Method 1 — Google Search Console

Indexing → Pages (the report formerly called Coverage). If the indexed count is wildly higher than your real page count (say 50,000 indexed when your site has 200 real pages), you're probably infected.

URL Inspection on a real page won't show the hack. But the bulk page count will give it away.
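
The comparison above is plain arithmetic, so a small shell function can make the threshold explicit. This is a minimal sketch: the function name index_anomaly and the default factor of 5 are assumptions chosen for illustration, and the indexed count still has to be read from Search Console by hand.

```shell
# Hypothetical helper: flag an indexed-page count far above the real page count.
# Usage: index_anomaly INDEXED REAL [FACTOR]
# FACTOR defaults to 5, an arbitrary assumption; tune it to your site.
index_anomaly() (
    indexed=$1
    real=$2
    factor=${3:-5}
    if [ "$indexed" -gt $((real * factor)) ]; then
        echo "ALERT: $indexed indexed vs $real real pages"
    else
        echo "ok"
    fi
)
```

With the numbers from the example, index_anomaly 50000 200 reports an alert. If WP-CLI is installed, the real count can likely be taken from wp post list --post_type=post,page --post_status=publish --format=count.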

Method 2 — Google site: search

Search Google for:

site:yoursite.com 日本語

or

site:yoursite.com ロレックス

If results return with Japanese characters, the hack is indexed.

Method 3 — Curl as Googlebot

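# Send Googlebot's full user-agent string; -s hides the progress meter, -L follows redirects
UA="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
curl -sL -A "$UA" https://yoursite.com/sitemap.xml | head -50
curl -sL -A "$UA" https://yoursite.com/sitemap_index.xml | grep -i "jp\|japan"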

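If the sitemap returns URLs you don't recognize or contains entries for Japanese pages, you're infected. A clean result is not proof of a clean site: some cloaking code also checks whether the request comes from a Google IP address, which curl cannot imitate.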
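
To compare what a browser gets with what a crawler gets, it can help to save both responses and count the lines that contain Japanese kana. The sketch below assumes the responses are UTF-8; kana_lines is a hypothetical helper name. It matches raw bytes so that it does not depend on grep -P or on the system locale.

```shell
# Count lines in a saved response that contain hiragana or katakana.
# UTF-8 encodes U+3040..U+30FF as byte E3 followed by 81, 82 or 83,
# so a byte-level match in the C locale is enough.
kana_lines() (
    pattern=$(printf '\343[\201-\203]')
    LC_ALL=C grep -c "$pattern" "$1" || true
)

# Example use (not run here, needs network access):
#   UA="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
#   curl -sL -A "$UA" https://yoursite.com/ > /tmp/as-bot.html
#   curl -sL          https://yoursite.com/ > /tmp/as-browser.html
#   kana_lines /tmp/as-bot.html
#   kana_lines /tmp/as-browser.html
```

A count that is zero for the browser copy and non-zero for the Googlebot copy points to cloaking. A site with legitimate Japanese content will show kana in both.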

Where the malware hides

We have cataloged the locations the Japanese hack uses:

Database

The fake posts often go into wp_posts with post_status = 'publish' and post_type = 'post' (or a custom type). They look like real posts in the database, just with Japanese content.

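-- MySQL 8.0+ (ICU) and MariaDB (PCRE) accept \p{...}; the backslash must be doubled inside a SQL string literal
SELECT ID, post_title, post_date FROM wp_posts
WHERE post_content REGEXP '[\\p{Hiragana}\\p{Katakana}\\p{Han}]'
   OR post_title REGEXP '[\\p{Hiragana}\\p{Katakana}\\p{Han}]'
ORDER BY post_date DESC LIMIT 100;

MySQL versions before 8.0 don't support \p{...} Unicode classes. On those servers, search for specific byte ranges or known keywords: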

SELECT post_title FROM wp_posts WHERE post_title LIKE '%ロレックス%' LIMIT 10;
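
The queries above assume the default wp_ table prefix, which many installs change. The sketch below reads the prefix from wp-config.php and builds the keyword query for it. Both function names are hypothetical, and the keywords are inserted into the SQL unescaped, so only pass strings you typed yourself.

```shell
# Print the table prefix configured in a wp-config.php file.
wp_table_prefix() {
    sed -n "s/^[[:space:]]*\$table_prefix[[:space:]]*=[[:space:]]*['\"]\([^'\"]*\)['\"].*/\1/p" "$1" | head -n 1
}

# Print a SELECT listing posts whose title contains any of the keywords.
# Usage: keyword_query PREFIX KEYWORD [KEYWORD...]
keyword_query() (
    prefix=$1
    shift
    where=""
    for kw in "$@"; do
        # Join the conditions with OR, skipping the separator on the first one
        where="${where:+$where OR }post_title LIKE '%${kw}%'"
    done
    printf 'SELECT ID, post_title, post_date FROM %sposts WHERE %s LIMIT 100;\n' "$prefix" "$where"
)
```

For example, keyword_query "$(wp_table_prefix wp-config.php)" ロレックス スーパーコピー prints a statement you can review and then paste into your MySQL client.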

Files

The malware payload is often in:

  • wp-content/mu-plugins/<random>.php: runs unconditionally, hard to remove via admin
  • wp-content/plugins/<innocent-looking>/init.php: pretends to be a legitimate plugin
  • wp-content/themes/<active>/inc/seo.php: hidden inside the theme
  • wp-content/uploads/2024/12/wp-cache.php: disguised as a cache file
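
Two of these locations are quick to enumerate from the shell. The following is a sketch with hypothetical function names; it assumes the standard wp-content layout and a find that supports -maxdepth (GNU and BSD find both do). The uploads directory normally holds media, so any PHP file listed there deserves a look.

```shell
# List PHP-like files under wp-content/uploads.
# Usage: php_in_uploads /path/to/wordpress
php_in_uploads() {
    find "$1/wp-content/uploads" -type f \
        \( -name '*.php' -o -name '*.phtml' -o -name '*.php[0-9]' \) 2>/dev/null | sort
}

# List the PHP files that WordPress loads automatically from mu-plugins.
# Usage: list_mu_plugins /path/to/wordpress
list_mu_plugins() {
    find "$1/wp-content/mu-plugins" -maxdepth 1 -type f -name '*.php' 2>/dev/null | sort
}
```

Every file that list_mu_plugins prints should be one you or your host installed on purpose. Some managed hosts ship their own mu-plugins, so check before deleting.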

Sitemaps

# Run from the WordPress root directory
ls -la *.xml
cat wp-content/uploads/sitemap*.xml 2>/dev/null | head

Look for non-standard sitemap files. Yoast and RankMath generate dynamic sitemaps at known URLs; any static XML file in the root is suspicious.
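
Once you have a suspicious sitemap file, you can pull out the URLs that carry Japanese keywords. Sitemap URLs are usually percent-encoded, so hiragana and katakana show up as %E3%81, %E3%82 or %E3%83. This is a rough sketch with a hypothetical function name; it relies on grep -o (available in GNU and BSD grep) and will miss URLs that embed raw, unencoded kana.

```shell
# Print the <loc> URLs in a sitemap file that contain percent-encoded kana.
# Usage: sitemap_kana_urls sitemap.xml
sitemap_kana_urls() {
    grep -o '<loc>[^<]*</loc>' "$1" |
        sed 's/<[^>]*>//g' |
        grep -i '%E3%8[123]' || true
}
```

Save the output to a file: the same list is useful later when you decide which URL prefixes to submit for removal.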

Removal procedure

Step 1 — Take inventory before deleting

# Files modified in last 30 days
find /var/www/yoursite -name "*.php" -mtime -30 > /tmp/recent-php.txt
wc -l /tmp/recent-php.txt

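This is your starting list of files to investigate, not a guaranteed complete one: attackers often reset modification times with touch, so a backdoor can carry an old date.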
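
Because a modification time can be set to any date with touch, it can be worth checking the inode change time (ctime) as well, which touch itself updates. A minimal sketch, with a hypothetical function name:

```shell
# List PHP files whose content (mtime) OR metadata (ctime) changed recently.
# Usage: recent_php DIRECTORY DAYS
recent_php() {
    find "$1" -type f -name '*.php' \( -mtime "-$2" -o -ctime "-$2" \) | sort
}
```

For example, recent_php /var/www/yoursite 30 > /tmp/recent-php.txt. Note that ctime also changes on harmless operations such as chmod or a restore from backup, so expect some noise.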

Step 2 — Identify and delete fake posts

-- Backup before deleting!
CREATE TABLE wp_posts_backup AS SELECT * FROM wp_posts WHERE post_title LIKE '%ロレックス%' OR post_title LIKE '%スーパーコピー%';

-- Then delete
DELETE FROM wp_posts WHERE post_title LIKE '%ロレックス%' OR post_title LIKE '%スーパーコピー%';

-- Clean up postmeta orphans
DELETE pm FROM wp_postmeta pm LEFT JOIN wp_posts p ON pm.post_id = p.ID WHERE p.ID IS NULL;

Step 3 — Remove malicious files

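Manually inspect and remove each file you found in mu-plugins, in suspicious plugins, and in theme injections. If you're not sure whether a file is legitimate, save a copy and remove it; rebuild functionality afterward if needed. Also delete the backdoor user the webshell created (Users → All Users, filtered by Administrator), or the attacker can simply log back in.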

Step 4 — Restore .htaccess

The hack typically writes URL-rewrite rules to .htaccess. Replace the file with the WordPress default (shown here for a single-site install at the domain root):

# BEGIN WordPress
<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteRule ^index\.php$ - [L]
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule . /index.php [L]
</IfModule>
# END WordPress
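
Before overwriting the file, you may want to see what was added to it. The sketch below prints every active directive that sits outside the WordPress block. The function name is hypothetical. It does not inspect rules injected inside the block itself, and legitimate plugins (caching, security) also add their own sections, so treat the output as a list to review.

```shell
# Print non-comment, non-blank lines outside "# BEGIN WordPress" ... "# END WordPress".
# Usage: htaccess_extra_lines /path/to/.htaccess
htaccess_extra_lines() {
    awk '
        /^# BEGIN WordPress/ { inside = 1; next }
        /^# END WordPress/   { inside = 0; next }
        !inside && NF && $0 !~ /^[ \t]*#/ { print FNR ": " $0 }
    ' "$1"
}
```

Each output line is prefixed with its line number in the file, which makes it easy to find in an editor.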

Step 5 — Regenerate clean sitemap

Delete any non-standard sitemap files. Reinstall your SEO plugin (Yoast/RankMath) from a fresh copy so it regenerates the dynamic sitemap.

Step 6 — Submit removal to Google

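Google Search Console → Removals → New Request. Submit removal requests for the spam URLs. For a limited number of URLs, submit each one with "Remove this URL only". For thousands, use "Remove all URLs with this prefix" on the affected subfolders. These removals are temporary (about six months), so the spam URLs must also return 404 or 410 for Google to drop them for good.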

Also: Submit your real sitemap fresh so Google re-crawls properly.
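
To decide which subfolders to submit, it helps to group the spam URLs by their first path segment and count them. The sketch below assumes a text file with one spam URL per line (for example, exported from Search Console or extracted from the rogue sitemap). The function name is hypothetical.

```shell
# Group URLs by scheme://host/first-segment/ and print "COUNT PREFIX", largest first.
# URLs that sit directly in the site root (no subfolder) are skipped.
# Usage: spam_prefixes urls.txt
spam_prefixes() {
    sed -n 's|^\(https\{0,1\}://[^/]*/[^/]*/\).*|\1|p' "$1" |
        sort | uniq -c | awk '{ print $1, $2 }' | sort -k1,1nr -k2,2
}
```

A prefix with a large count and none of your real pages under it is a good candidate for a single prefix removal request.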

Step 7 — Request review

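Search Console → Security Issues → Request Review, with a description of what was found and cleaned. The option appears only if Google has flagged the site. In our experience Google typically reviews within 72 hours, though reviews of hacked-spam cases can take longer.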

Why this hack succeeds so often

The Japanese SEO hack stays undetected longer than other malware because:

  • It doesn't affect your homepage or known URLs
  • Logged-in admins never see the pages
  • Most security plugins focus on file-level malware signatures, missing database-resident attacks
  • Site owners check Google Analytics rather than Search Console; the spam doesn't get analytics tracking

If you only monitor your site visually, you won't see this hack until Google flags you.

Prevention

After cleanup:

  • File integrity monitoring with alerts for new files in wp-content/mu-plugins/
  • Daily check on Google Search Console for indexed page count anomalies
  • WAF rules to block the specific exploit attempts on file manager plugins
  • Removal of any file manager, backup-restore, or migration plugin not actively in use
  • Restricted upload directory: PHP execution disabled at server level
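
The first item, alerting on new files in wp-content/mu-plugins/, can be approximated with a saved file list and a comparison run from cron. This is a minimal sketch with hypothetical function names. It only detects files that were added, not existing files that were modified, and the baseline should be stored outside the web root.

```shell
# Print a sorted list of every file under a directory (the baseline format).
# Usage: list_files DIRECTORY > baseline.txt
list_files() {
    find "$1" -type f | LC_ALL=C sort
}

# Print files that exist now but are missing from a saved baseline.
# Usage: new_files BASELINE_FILE DIRECTORY
new_files() {
    list_files "$2" | LC_ALL=C comm -13 "$1" -
}
```

Create the baseline right after cleanup with list_files wp-content/mu-plugins > /root/mu-baseline.txt (the path is only an example), then have a scheduled job mail you whenever new_files prints anything.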

Common mistakes

  • Deleting fake posts but leaving the generator code — new spam appears within hours
  • Trusting "site looks clean from my browser" — the hack hides from you specifically
  • Not regenerating the sitemap — Google keeps re-indexing the spam URLs
  • Skipping the removal request — even with site cleaned, Google's index keeps the spam for weeks

When to call a specialist

A Japanese keyword hack with 10,000+ indexed spam URLs is a multi-step recovery. The technical cleanup is 4-8 hours; the Google recovery work (sitemap, removal requests, review) takes 2-3 weeks. We handle both.

For a Japanese hack emergency, typical resolution takes 6-12 hours of work spread over the recovery window. For broader malware, see malware removal.