python版本
```python
import os
import hashlib
from collections import defaultdict
def file_hash(path, chunk_size=8192):
    """Return the hex MD5 digest of the file at *path*.

    The file is read in chunks of *chunk_size* bytes so arbitrarily large
    files can be hashed with constant memory.
    """
    h = hashlib.md5()
    # "rb" must use straight quotes; the pasted curly quotes were a syntax error.
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            h.update(chunk)
    return h.hexdigest()
def find_duplicates(root_dir):
    """Recursively scan *root_dir*, group files by MD5, and print duplicates.

    Unreadable files (permissions, vanished between walk and open, ...) are
    reported and skipped rather than aborting the whole scan.
    """
    hash_map = defaultdict(list)
    for dirpath, _, filenames in os.walk(root_dir):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                file_md5 = file_hash(path)
                hash_map[file_md5].append(path)
            # OSError covers all file-I/O failures; narrower than the original
            # blanket Exception, same best-effort skip-and-report behavior.
            except OSError as e:
                print(f"Error reading {path}: {e}")
    # Print each group of files that share a hash (i.e. identical content).
    for files in hash_map.values():
        if len(files) > 1:
            print("\n[重复文件]")
            for f in files:
                print(f)
if __name__ == "__main__":
    # Raw string: "C:\your\folder\path" would interpret \f as a form feed.
    find_duplicates(r"C:\your\folder\path")
```
或者powershell
powershell -command "Get-ChildItem -File -Recurse | Group-Object -Property { (Get-FileHash -Path $_.FullName -Algorithm MD5).Hash } | Where-Object { $_.Count -gt 1 } | ForEach-Object { $_.Group | Select-Object FullName }"
或者cmd
```bat
@echo off
setlocal ENABLEDELAYEDEXPANSION
:: Temp file holding one "<hash> "<path>"" line per file
set hashlist=hashes.txt
del %hashlist% >nul 2>&1
echo 正在计算文件哈希值,请稍候…
:: (*) iterates files; the original (.) only matched directories.
:: findstr keeps only certutil's hex-digest line, dropping its header/footer.
for /R %%F in (*) do (
    for /f "tokens=1" %%H in ('certutil -hashfile "%%F" MD5 ^| findstr /R "^[0-9a-fA-F]"') do (
        echo %%H "%%F" >> %hashlist%
    )
)
echo 查找重复文件:
:: Sorting groups identical hashes together; the original piped through
:: "findstr /D" (a directory-search option), which broke the pipeline.
sort %hashlist% > sorted.txt
:: Emit each duplicate group exactly once by tracking the previous hash;
:: the original re-counted and re-printed the group for every member line.
set prev=
for /f "tokens=1,* delims= " %%A in (sorted.txt) do (
    if not "%%A"=="!prev!" (
        findstr /B /C:"%%A " sorted.txt | find /C /V "" > count.txt
        set /p count= < count.txt
        if !count! GTR 1 (
            echo.
            echo [重复文件:MD5=%%A]
            findstr /B /C:"%%A " sorted.txt
        )
    )
    set prev=%%A
)
del count.txt sorted.txt >nul 2>&1
```