ocrmypdf: convert PDF files into editable text
1. Install on Ubuntu/UOS
sudo apt install ocrmypdf tesseract-ocr-chi-sim
2. Convert
ocrmypdf -l eng+chi_sim --force-ocr input.pdf output.pdf
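To process several files at once, the same command can be wrapped in a small loop. The following is a minimal sketch, not part of the original notes: the function name ocr_dir and the OCR_CMD variable are hypothetical, introduced only so the OCR command can be swapped out. The ocrmypdf options are the ones used above.

```shell
#!/bin/sh
# Hypothetical helper: OCR every PDF in IN_DIR and write the results to OUT_DIR.
# OCR_CMD is an assumption added for illustration; it defaults to ocrmypdf.
ocr_dir() {
    in_dir=$1
    out_dir=$2
    mkdir -p "$out_dir" || return 1
    for pdf in "$in_dir"/*.pdf; do
        # If the glob matched nothing, $pdf is the literal pattern; skip it.
        [ -e "$pdf" ] || continue
        name=$(basename "$pdf")
        # Same options as the single-file command: English + Simplified Chinese,
        # forcing OCR even on pages that already contain text.
        ${OCR_CMD:-ocrmypdf} -l eng+chi_sim --force-ocr "$pdf" "$out_dir/$name" \
            || echo "failed: $pdf" >&2
    done
}
```

Usage might look like: ocr_dir ./scans ./scans-ocr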
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
<!-- SimSun (宋体) -->
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>\5B8B\4F53</string></test>
<edit name="family" mode="assign" binding="same"><string>CESI_SS_GB18030</string></edit>
</match>
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>simsun</string></test>
<edit name="family" mode="assign" binding="same"><string>CESI_SS_GB18030</string></edit>
</match>
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>宋体</string></test>
<edit name="family" mode="assign" binding="same"><string>CESI_SS_GB18030</string></edit>
</match>
<!-- KaiTi (楷体) -->
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>\6977\4F53</string></test>
<edit name="family" mode="assign" binding="same"><string>CESI_KT_GB18030</string></edit>
</match>
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>kaiti</string></test>
<edit name="family" mode="assign" binding="same"><string>CESI_KT_GB18030</string></edit>
</match>
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>楷体</string></test>
<edit name="family" mode="assign" binding="same"><string>CESI_KT_GB18030</string></edit>
</match>
<!-- FangSong (仿宋) -->
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>\4EFF\5B8B</string></test>
<edit name="family" mode="assign" binding="same"><string>CESI_FS_GB18030</string></edit>
</match>
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>fangsong</string></test>
<edit name="family" mode="assign" binding="same"><string>CESI_FS_GB18030</string></edit>
</match>
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>仿宋</string></test>
<edit name="family" mode="assign" binding="same"><string>CESI_FS_GB18030</string></edit>
</match>
<!-- SimHei (黑体) -->
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>\9ED1\4F53</string></test>
<edit name="family" mode="assign" binding="same"><string>CESI_HT_GB18030</string></edit>
</match>
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>heiti</string></test>
<edit name="family" mode="assign" binding="same"><string>CESI_HT_GB18030</string></edit>
</match>
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>黑体</string></test>
<edit name="family" mode="assign" binding="same"><string>CESI_HT_GB18030</string></edit>
</match>
<!-- Microsoft YaHei (微软雅黑) -->
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>\5FAE\8F6F\96C5\9ED1</string></test>
<edit name="family" mode="assign" binding="same"><string>Noto Sans CJK SC</string></edit>
</match>
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>Microsoft Yahei</string></test>
<edit name="family" mode="assign" binding="same"><string>Noto Sans CJK SC</string></edit>
</match>
<match target="pattern">
<test qual="any" name="family" compare="eq"><string>微软雅黑</string></test>
<edit name="family" mode="assign" binding="same"><string>Noto Sans CJK SC</string></edit>
</match>
</fontconfig>
First install an emoji font, using Noto Color Emoji as the example (the command below is for Arch Linux):
sudo pacman -S noto-fonts-emoji
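Among the Fontconfig configuration files, 70-no-bitmaps.conf disables bitmap fonts. Bitmap fonts are sometimes used as fallbacks for missing fonts, which can make text render pixelated or too large.
Keeping this file in /etc/fonts/conf.d/ disables bitmap fonts: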
sudo ln -s /usr/share/fontconfig/conf.avail/70-no-bitmaps.conf /etc/fonts/conf.d/70-no-bitmaps.conf
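Fontconfig sometimes treats certain emoji fonts as bitmap fonts too, so using this configuration file can disable emoji fonts at the same time. If embedded bitmaps are disabled for all fonts, they can still be enabled for specific fonts that do not work properly without them. For example, to enable them for Noto Color Emoji: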
gedit 64-enable-emoji.conf
<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "urn:fontconfig:fonts.dtd">
<fontconfig>
<match target="font">
<edit name="embeddedbitmap" mode="assign">
<bool>false</bool>
</edit>
</match>
<match target="font">
<test name="family" qual="any">
<string>Noto Color Emoji</string>
</test>
<edit name="embeddedbitmap">
<bool>true</bool>
</edit>
</match>
</fontconfig>
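Save the file in /etc/fonts/conf.d/ (system-wide) or ~/.config/fontconfig/conf.d/ (per-user).
Scaling bitmap fonts usually makes them blurry; deleting /etc/fonts/conf.d/10-scale-bitmap-fonts.conf disables the scaling and fixes that. However, it also breaks the scaling of emoji fonts (such as Noto Color Emoji), making them huge. Since the other bitmap fonts were already disabled above, keep bitmap font scaling enabled.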
Check whether /etc/fonts/conf.d/ contains 10-scale-bitmap-fonts.conf; if it does not, create a symlink to it there:
sudo ln -s /usr/share/fontconfig/conf.default/10-scale-bitmap-fonts.conf /etc/fonts/conf.d/10-scale-bitmap-fonts.conf
fc-cache -fv
Close the applications that need to display emoji (browser, editor, terminal, and so on), reopen them, and type an emoji to see the result.
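The check-then-link steps for the configuration files above can be folded into one helper that is safe to run repeatedly. This is only a sketch under assumptions: ensure_link is a hypothetical name, and the fontconfig paths are the ones used in these notes, which may differ between distributions.

```shell
#!/bin/sh
# Hypothetical helper: create a symlink only if nothing exists at LINK yet.
ensure_link() {
    target=$1   # e.g. a file under /usr/share/fontconfig/conf.default/
    link=$2     # e.g. the matching path under /etc/fonts/conf.d/
    # -L also catches a dangling symlink, which -e alone would miss.
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "exists: $link"
    else
        ln -s "$target" "$link" && echo "linked: $link"
    fi
}
```

In a root shell it could be called as: ensure_link /usr/share/fontconfig/conf.default/10-scale-bitmap-fonts.conf /etc/fonts/conf.d/10-scale-bitmap-fonts.conf, followed by fc-cache -fv.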
1. Every file has an inode number that is unique within its filesystem. List the inode numbers:
ls -i
2. Use find together with rm to delete a file by its inode number. For example, to delete the file or directory with inode number 2236429:
find . -inum 2236429 -exec rm -rf {} \;
This method is suited to deleting a single file, or to deleting files with garbled names one at a time.
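Because find searches recursively, it may be safer to restrict the deletion to one directory and one filesystem. A minimal sketch, assuming GNU find (for -mindepth and -maxdepth); rm_by_inode is a hypothetical name, not an existing command:

```shell
#!/bin/sh
# Hypothetical helper: delete the entry directly inside DIR whose inode is INUM.
rm_by_inode() {
    dir=$1
    inum=$2
    # -mindepth 1 / -maxdepth 1: only look at entries directly inside DIR.
    # -xdev: stay on one filesystem, since inode numbers are only unique there.
    find "$dir" -mindepth 1 -maxdepth 1 -xdev -inum "$inum" -exec rm -rf -- {} \;
}
```

For example, after reading the number from ls -i: rm_by_inode . 2236429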
<?php
// database information
$servername = "localhost";
$username = "username";
$password = "password";
// create connection
$conn = new mysqli($servername,$username,$password);
// check connection
if ($conn->connect_error) {
die("Connection failed: " . $conn->connect_error);
}
echo "connected successfully";
//close the connection
$conn->close();
?>
<?php
// database information
$servername = "localhost";
$username = "username";
$password = "password";
// create connection
$conn = mysqli_connect($servername,$username,$password);
// check connection
if (!$conn) {
die("Connection failed: " . mysqli_connect_error());
}
echo "connected successfully";
//close the connection
mysqli_close($conn);
?>