
Step-by-Step Guide to Using Baidu OCR API with Java

This article provides a comprehensive Java tutorial for accessing Baidu's OCR service, covering prerequisite setup, Maven dependencies, token acquisition, image-to‑Base64 conversion, HTTP request construction, and performance observations for Chinese, English, and mixed‑language image recognition.

Java Captain

Documentation: The Baidu OCR API reference is available at http://ai.baidu.com/docs#/OCR-API/e1bd77f3 .

Preparation: The project uses Java 1.8 with Maven for dependency management. You must obtain an API_KEY and SECRET_KEY from the Baidu Developer Center by creating a "General Text Recognition" application; these are required to generate an access_token.

1. Maven dependencies (pom.xml):

<!-- https://mvnrepository.com/artifact/com.alibaba/fastjson -->
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.46</version>
</dependency>

<!-- https://mvnrepository.com/artifact/org.apache.httpcomponents/httpclient -->
<dependency>
    <groupId>org.apache.httpcomponents</groupId>
    <artifactId>httpclient</artifactId>
    <version>4.5.5</version>
</dependency>

2. Access token retrieval (AuthService.java):

package com.wsk.netty.check;

import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.JSONObject;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class AuthService {
    /** Returns an access_token for the configured application, or null on failure. */
    public static String getAuth() {
        String clientId = "**"; // API Key
        String clientSecret = "**"; // Secret Key
        return getAuth(clientId, clientSecret);
    }

    private static String getAuth(String ak, String sk) {
        String authHost = "https://aip.baidubce.com/oauth/2.0/token?";
        String getAccessTokenUrl = authHost
                + "grant_type=client_credentials"
                + "&client_id=" + ak
                + "&client_secret=" + sk;
        try {
            URL realUrl = new URL(getAccessTokenUrl);
            HttpURLConnection connection = (HttpURLConnection) realUrl.openConnection();
            connection.setRequestMethod("GET");
            connection.connect();
            StringBuilder result = new StringBuilder();
            // try-with-resources closes the reader even if reading fails
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    result.append(line);
                }
            }
            // Parse with fastjson, the JSON library declared in pom.xml
            JSONObject jsonObject = JSON.parseObject(result.toString());
            return jsonObject.getString("access_token");
        } catch (Exception e) {
            e.printStackTrace();
        }
        return null;
    }

    public static void main(String[] args) {
        // Print the token so a manual run shows whether the credentials work
        System.out.println(getAuth());
    }
}
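
Baidu's token endpoint returns an expires_in field (in seconds) alongside access_token, so a token can be reused across calls instead of being requested every time. The following is a minimal sketch of that idea and is not part of the original article: TokenCache, its fetcher argument, and the lifetime you pass in are all hypothetical, and the lifetime should be chosen from the expires_in value your own token response reports.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

// Hypothetical helper (not from the original article): reuses a token until it expires.
class TokenCache {
    private final Supplier<String> fetcher; // e.g. AuthService::getAuth
    private final Duration lifetime;        // assumed validity of a freshly fetched token
    private final Clock clock;              // injectable so expiry can be tested
    private String token;
    private Instant expiresAt = Instant.MIN;

    TokenCache(Supplier<String> fetcher, Duration lifetime, Clock clock) {
        this.fetcher = fetcher;
        this.lifetime = lifetime;
        this.clock = clock;
    }

    /** Returns the cached token, fetching a new one if none is cached or it has expired. */
    synchronized String get() {
        Instant now = clock.instant();
        if (token == null || !now.isBefore(expiresAt)) {
            token = fetcher.get();          // a null result is retried on the next call
            expiresAt = now.plus(lifetime);
        }
        return token;
    }
}
```

With the article's classes, this could be wired up as new TokenCache(AuthService::getAuth, Duration.ofDays(1), Clock.systemUTC()), where the one-day lifetime is only a placeholder.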

3. Image to Base64 and URL‑encode utility (BaseImg64.java):

package com.wsk.netty.check;

import java.io.IOException;
import java.net.URLEncoder;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Base64;

public class BaseImg64 {
    /**
     * Reads the image at imgPath, Base64-encodes it, and URL-encodes the result
     * so it can be sent as a form parameter.
     */
    public static String getImageStrFromPath(String imgPath) throws IOException {
        // Files.readAllBytes reads the whole file; InputStream.available() is only an estimate
        byte[] data = Files.readAllBytes(Paths.get(imgPath));
        // java.util.Base64 (Java 8+) replaces the internal sun.misc.BASE64Encoder
        String base64 = Base64.getEncoder().encodeToString(data);
        return URLEncoder.encode(base64, "UTF-8");
    }
}
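
The URL-encoding step matters because the standard Base64 alphabet includes +, / and =, all of which have special meaning in an application/x-www-form-urlencoded body (a raw + is decoded as a space). The sketch below shows the effect on a two-byte input using only the JDK; FormImageEncoder and encodeForForm are illustrative names, not part of the article's code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Base64;

// Illustrative helper: Base64-encode bytes, then URL-encode the result for a form body.
class FormImageEncoder {
    static String encodeForForm(byte[] imageBytes) {
        String base64 = Base64.getEncoder().encodeToString(imageBytes);
        try {
            return URLEncoder.encode(base64, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xfb, (byte) 0xff};
        System.out.println(Base64.getEncoder().encodeToString(sample)); // prints +/8=
        System.out.println(encodeForForm(sample));                      // prints %2B%2F8%3D
    }
}
```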

4. OCR request implementation (Check.java):

package com.wsk.netty.check;

import org.apache.http.client.methods.CloseableHttpResponse;
import org.apache.http.client.methods.HttpPost;
import org.apache.http.entity.StringEntity;
import org.apache.http.impl.client.CloseableHttpClient;
import org.apache.http.impl.client.HttpClients;
import org.apache.http.util.EntityUtils;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class Check {
    // The token is fetched once, when this class is loaded
    private static final String POST_URL = "https://aip.baidubce.com/rest/2.0/ocr/v1/general_basic?access_token=" + AuthService.getAuth();

    /** Recognizes text in a local image file and returns the raw JSON response. */
    public static String checkFile(String path) throws URISyntaxException, IOException {
        File file = new File(path);
        if (!file.exists()) {
            throw new FileNotFoundException("Image does not exist: " + path);
        }
        String image = BaseImg64.getImageStrFromPath(path); // already URL-encoded
        String param = "image=" + image;
        return post(param);
    }

    /** Recognizes text in an image at a remote URL and returns the raw JSON response. */
    public static String checkUrl(String url) throws IOException, URISyntaxException {
        // Form values must be URL-encoded; otherwise '&', '=' and '%' in the URL corrupt the body
        String param = "url=" + URLEncoder.encode(url, "UTF-8");
        return post(param);
    }

    private static String post(String param) throws URISyntaxException, IOException {
        HttpPost post = new HttpPost(new URI(POST_URL));
        post.setHeader("Content-Type", "application/x-www-form-urlencoded");
        post.setEntity(new StringEntity(param, StandardCharsets.UTF_8));
        // CloseableHttpClient replaces the deprecated DefaultHttpClient; both resources are closed here
        try (CloseableHttpClient httpClient = HttpClients.createDefault();
             CloseableHttpResponse response = httpClient.execute(post)) {
            if (response.getStatusLine().getStatusCode() == 200) {
                return EntityUtils.toString(response.getEntity(), StandardCharsets.UTF_8);
            }
            return null; // non-200 responses are reported as null
        }
    }

    public static void main(String[] args) {
        String path = "E:\\find.png";
        try {
            long now = System.currentTimeMillis();
            System.out.println(checkFile(path));
            System.out.println(checkUrl("https://gss3.bdstatic.com/-Po3dSag_xI4khGkpoWK1HF6hhy/baike/c0%3Dbaike80%2C5%2C5%2C80%2C26/sign=08c05c0e8444ebf8797c6c6db890bc4f/fc1f4134970a304e46bfc5f7d2c8a786c9175c19.jpg"));
            // Report milliseconds; integer division by 1000 would round sub-second runs down to 0
            System.out.println("Elapsed: " + (System.currentTimeMillis() - now) + " ms");
        } catch (URISyntaxException | IOException e) {
            e.printStackTrace();
        }
    }
}
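
Because the request body is an ordinary form-encoded string, further parameters can be appended in the same way as image or url. The sketch below builds such a body from a map and encodes every key and value. FormBody and encode are hypothetical names, example.com is a placeholder, and language_type with the value CHN_ENG is shown on the assumption that it is an optional parameter of this endpoint; check the API reference for the parameters and values actually supported.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds an application/x-www-form-urlencoded body from key/value pairs.
class FormBody {
    static String encode(Map<String, String> params) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> entry : params.entrySet()) {
            body.add(enc(entry.getKey()) + "=" + enc(entry.getValue()));
        }
        return body.toString();
    }

    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("url", "https://example.com/a b.jpg?x=1&y=2");
        params.put("language_type", "CHN_ENG"); // assumed optional parameter
        System.out.println(encode(params));
    }
}
```

When sending a local image through a builder like this, pass the plain Base64 string as the value. The output of BaseImg64.getImageStrFromPath is already URL-encoded and would end up encoded twice.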

5. Test results: The author tested three scenarios – Chinese‑only, English‑only, and mixed‑language images – using both local files and remote URLs. Screenshots (omitted here) show that recognition completes in about one second with high accuracy, though some characters may be missed in longer Chinese strings.

Conclusion: Accessed from Java, Baidu's OCR API extracted text from Chinese and English images in about one second in the author's tests, with reasonably good accuracy. The utility classes above handle token retrieval, image encoding, and HTTP communication, which keeps integration straightforward for Java developers.
