springboot如何导入excel百万数据呢?
下文笔者讲述SpringBoot导入excel中百万数据的方法及示例分享,如下所示
导入百万数据的挑战
我们都知道任何程序,导入百万数据,都不是一件容易的事情 需我们着重关注性能和内存的使用情况,下文笔者将讲述具体的导入方法
导入百万数据的步骤
使用Apache POI的`SXSSFWorkbook`可有效处理大数据量的Excel文件 因为它基于流式处理 不会将整个文件加载到内存中 以下是一个详细的步骤 1.添加依赖: 在`pom.xml`中添加Apache POI依赖 2.创建控制器: 处理文件上传请求 3.创建服务: 处理Excel文件读取和数据处理 4.使用SXSSFWorkbook: 高效读取大数据量的Excel文件 5.批量插入数据库: 使用JPA或MyBatis等ORM框架批量插入数据 以提高性能 ===================================================================== 添加依赖 在`pom.xml`中添加以下依赖: <dependencies> <!-- Spring Boot Starter Web --> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-web</artifactId> </dependency> <!-- Spring Boot Starter Data JPA --> <dependency> <groupId>org.springframework.boot</groupId> <artifactId>spring-boot-starter-data-jpa</artifactId> </dependency> <!-- H2 Database (for testing) --> <dependency> <groupId>com.h2database</groupId> <artifactId>h2</artifactId> <scope>Runtime</scope> </dependency> <!-- Apache POI for Excel processing --> <dependency> <groupId>org.apache.poi</groupId> <artifactId>poi-ooxml</artifactId> <version>5.2.3</version> </dependency> <!-- Apache POI for SXSSF (Streaming Usermodel API) --> <dependency> <groupId>org.apache.poi</groupId> <artifactId>poi-ooxml-schemas</artifactId> <version>4.1.2</version> </dependency> <!-- XMLBeans for POI --> <dependency> <groupId>org.apache.xmlbeans</groupId> <artifactId>xmlbeans</artifactId> <version>5.1.1</version> </dependency> <!-- Commons Compress for POI --> <dependency> <groupId>org.apache.commons</groupId> <artifactId>commons-compress</artifactId> <version>1.21</version> </dependency> </dependencies>
实体类创建
import javax.persistence.Entity; import javax.persistence.GeneratedValue; import javax.persistence.GenerationType; import javax.persistence.Id; @Entity public class UserData { @Id @GeneratedValue(strategy = GenerationType.IDENTITY) private Long id; private String name; private String email; private String phone; // Getters and Setters public Long getId() { return id; } public void setId(Long id) { this.id = id; } public String getName() { return name; } public void setName(String name) { this.name = name; } public String getEmail() { return email; } public void setEmail(String email) { this.email = email; } public String getPhone() { return phone; } public void setPhone(String phone) { this.phone = phone; } }
创建Repository接口
创建JPA Repository接口 处理数据库操作 import org.springframework.data.jpa.repository.JpaRepository; public interface UserDataRepository extends JpaRepository<UserData, Long> { // Custom query methods can be defined here if needed }
创建Controller控制器
创建一个控制器来处理文件上传请求 import org.springframework.beans.factory.annotation.Autowired; import org.springframework.web.bind.annotation.PostMapping; import org.springframework.web.bind.annotation.RequestParam; import org.springframework.web.bind.annotation.RestController; import org.springframework.web.multipart.MultipartFile; @RestController public class ExcelUploadController { @Autowired private ExcelService excelService; @PostMapping("/upload") public String uploadFile(@RequestParam("file") MultipartFile file) { if (file.isEmpty()) { return "Please select a file to upload."; } try { excelService.processExcelFile(file); return "File uploaded and processed successfully."; } catch (Exception e) { e.printStackTrace(); return "Failed to process the file."; } } }
创建服务类处理程序
创建一个服务类 来处理Excel文件的读取和数据处理 并批量插入数据库 import org.apache.poi.ss.usermodel.*; import org.apache.poi.xssf.streaming.SXSSFWorkbook; import org.apache.poi.xssf.usermodel.XSSFWorkbook; import org.springframework.beans.factory.annotation.Autowired; import org.springframework.stereotype.Service; import org.springframework.web.multipart.MultipartFile; import javax.transaction.Transactional; import java.io.InputStream; import java.util.Arraylist; import java.util.Iterator; import java.util.List; @Service public class ExcelService { @Autowired private UserDataRepository userDataRepository; @Transactional public void processExcelFile(MultipartFile file) throws Exception { try (InputStream inputStream = file.getInputStream(); Workbook workbook = new SXSSFWorkbook(new XSSFWorkbook(inputStream), 100)) { //内存中只保留100行数据 Sheet sheet = workbook.getSheetAt(0); Iterator<Row> rowIterator = sheet.iterator(); // Skip header row if present if (rowIterator.hasNext()) { rowIterator.next(); } List<UserData> userDataList = new ArrayList<>(); int batchSize = 1000; // Define batch size int count = 0; while (rowIterator.hasNext()) { Row row = rowIterator.next(); Iterator<Cell> cellIterator = row.cellIterator(); UserData userData = new UserData(); int cellIndex = 0; while (cellIterator.hasNext()) { Cell cell = cellIterator.next(); switch (cellIndex) { case 0: userData.setName(cell.getStringCellValue()); break; case 1: userData.setEmail(cell.getStringCellValue()); break; case 2: userData.setPhone(cell.getStringCellValue()); break; default: break; } cellIndex++; } userDataList.add(userData); count++; // Save batch if (count % batchSize == 0) { userDataRepository.saveAll(userDataList); userDataList.clear(); } } // Save remaining data if (!userDataList.isEmpty()) { userDataRepository.saveAll(userDataList); } // Flush and dispose of temporary files ((SXSSFWorkbook) workbook).dispose(); } } }
采用以上程序可完美的插入百万数据 可避免内存溢出 批量插入优化 - 批量大小: 通过设置`batchSize`来控制每次批量插入的数据量 可根据实际情况调整 - 事务管理: 使用`@Transactional`注解确保批量 插入操作在一个事务中进行 提高数据一致性和性能
版权声明
本文仅代表作者观点,不代表本站立场。
本文系作者授权发表,未经许可,不得转载。