
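Extracting XML document content with Java and dom4j.jar

This article shares working code for extracting the content of XML documents with Java and dom4j.jar, for your reference. The details follow.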


This example extracts content from XML files using several traversal operations. It is not the most efficient approach; the main purpose is to practice those traversal operations.

XML document content (only the element text is reproduced here):

<?xml version="1.0" encoding="UTF-8"?> 
 
An End to Nuclear Testing

ATOMIC WEAPONS
NUCLEAR TESTS
TESTS AND TESTING
EDITORIALS
CLINTON, BILL (PRES)
Editorial
Top/Opinion
Top/Opinion/Opinion
Top/Opinion/Opinion/Editorials
Nuclear Tests
Atomic Weapons
Tests and Testing
Armament, Defense and Military Forces

An End to Nuclear Testing

For nearly half a century, test explosions in the Nevada desert were a reverberating reminder of cold war insecurity. Now the biggest worry is nuclear proliferation, not the Soviet threat. That's why President Clinton has quietly decided to extend the moratorium on tests of nuclear arms for at least 15 months.

To persuade nuclear have-nots to stay out of the bomb-making business, it makes more sense to halt testing and try to get others to do likewise than to conduct more demonstrations of America's deterrent power.


For nearly half a century, test explosions in the Nevada desert were a reverberating reminder of cold war insecurity. Now the biggest worry is nuclear proliferation, not the Soviet threat. That's why President Clinton has quietly decided to extend the moratorium on tests of nuclear arms for at least 15 months.

To persuade nuclear have-nots to stay out of the bomb-making business, it makes more sense to halt testing and try to get others to do likewise than to conduct more demonstrations of America's deterrent power.

Not that nuclear wannabes will necessarily follow America's lead. Nor will an end to all testing assure an end to bomb-making; states like Pakistan have developed nuclear devices without testing them first.

But calling a halt to U.S. nuclear testing makes it easier for leaders in Russia and France to extend the moratoriums they are now observing and improve the atmosphere for prompt negotiation of a treaty to ban all tests.

That test ban in turn should shore up international support for the 1968 Nonproliferation Treaty, linchpin of efforts to stop the spread of nuclear arms, when it comes up for review in 1995. It will also bolster the backing for tighter controls on exports used in bomb-making.

Mr. Clinton has taken three helpful steps. He has extended the Congressionally mandated moratorium on U.S. tests that was due to expire last week. He has declared that the U.S. will not test unless another nation does so first. And he wants to negotiate a total ban on testing.

But the President also wants the nuclear labs to be prepared for a prompt resumption of warhead safety and reliability tests. This could cost millions of dollars and doesn't make much sense, since in Mr. Clinton's own words, "After a thorough review, my Administration has determined that the nuclear weapons in the United States' arsenal are safe and reliable."

Moreover, preparations for testing can take on a life of their own: 30 years after the Limited Test Ban Treaty put an end to above-ground tests, the U.S. still spends $20 million a year on Safeguard C, a program to keep test sites ready.

American security no longer rests on that sort of eternal nuclear vigilance. Mr. Clinton's moratorium may make America safer than all the tests and preparations for tests that the nuclear labs can dream up.

Extraction code:

To process many files, first walk the directory tree and collect every file path into a list, then handle the paths in that list one by one.

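package com.njupt.ymh;

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/**
 * Returns the list of file paths under a directory.
 * @author 11860
 */
public class SearchFile {

    /**
     * Recursively collects the absolute paths below directoryPath.
     * @param directoryPath directory to scan
     * @param isAddDirectory whether the paths of sub-directories are also added to the list
     * @return the collected paths; empty if directoryPath is a file or does not exist
     */
    public static List<String> getAllFile(String directoryPath, boolean isAddDirectory) {
        List<String> list = new ArrayList<>(); // collected file paths
        File baseFile = new File(directoryPath); // current directory

        if (baseFile.isFile() || !baseFile.exists())
            return list;

        File[] files = baseFile.listFiles(); // children of the current directory
        if (files == null) // listFiles() returns null on an I/O error
            return list;

        for (File file : files) {
            if (file.isDirectory()) {
                if (isAddDirectory)
                    list.add(file.getAbsolutePath()); // full path of the sub-directory

                list.addAll(getAllFile(file.getAbsolutePath(), isAddDirectory));
            } else {
                list.add(file.getAbsolutePath());
            }
        }
        return list;
    }

    public static void main(String[] args) throws IOException {
        List<String> listFile = SearchFile.getAllFile("E:\\huadai", false);
        System.out.println(listFile.size());
        if (listFile.isEmpty())
            return;

        // Print the first file line by line, then its size in bytes.
        // java.nio is used here so the class compiles with the JDK alone.
        String first = listFile.get(0);
        for (String line : Files.readAllLines(Paths.get(first))) {
            System.out.println(line.trim()); // current line
        }
        System.out.println(new File(first).length());
    }
}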
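
The recursive listing above can also be written with the JDK's `java.nio.file.Files.walk`. The following is a minimal sketch, not the author's method: the class name `FileWalker` and the method name `listAll` are made up for illustration, and the sketch sorts its result, which the original does not do.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; mirrors the behaviour of SearchFile.getAllFile.
final class FileWalker {

    /**
     * Returns the absolute paths of everything below root.
     * Directories are included only when includeDirectories is true.
     */
    static List<String> listAll(String root, boolean includeDirectories) {
        Path base = Paths.get(root);
        if (!Files.isDirectory(base)) {
            // Like the original: a plain file or a missing path yields an empty list.
            return Collections.emptyList();
        }
        // Files.walk visits the tree depth-first; the stream must be closed.
        try (Stream<Path> paths = Files.walk(base)) {
            return paths
                    .filter(p -> !p.equals(base)) // skip the starting directory itself
                    .filter(p -> includeDirectories || !Files.isDirectory(p))
                    .map(p -> p.toAbsolutePath().toString())
                    .sorted() // assumption: a stable order is convenient for callers
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as `FileWalker.listAll("E:\\huadai", false)` would then stand in for `SearchFile.getAllFile("E:\\huadai", false)`.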

package com.njupt.ymh;

import java.io.File;
import java.util.Iterator;
import java.util.List;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.Element;
import org.dom4j.Node;
import org.dom4j.io.SAXReader;

public class NewsPaper {
    int doc_id;            // article id
    String doc_title;      // article title
    String lead_paragraph; // lead paragraph
    String full_text;      // article body
    String date;           // article date

    public NewsPaper(String xml) {
        doc_id = -1;
        doc_title = null;
        lead_paragraph = null;
        full_text = null;
        date = null;
        searchValue(xml);
    }

    /**
     * Loads the XML file into a Document.
     * @param fileName path of the XML file
     * @return the parsed Document, or null if parsing failed
     */
    private Document load(String fileName) {
        Document document = null;
        SAXReader saxReader = new SAXReader(); // reads the file stream

        try {
            document = saxReader.read(new File(fileName));
        } catch (DocumentException e) {
            e.printStackTrace();
        }

        return document;
    }

    /**
     * Returns the root element of the Document.
     * @param document the parsed document
     */
    private Element getRootNode(Document document) {
        return document.getRootElement();
    }

    /**
     * Extracts the required node values.
     * @param xml path of the XML file
     */
    private void searchValue(String xml) {
        Document document = load(xml);
        if (document == null) // parsing failed; isUseful() will then return false
            return;
        Element root = getRootNode(document); // root element

        // Article date, cut out of the file path. This assumes the ten
        // characters after a prefix such as E:\huadai\ encode the date.
        if (xml.length() >= 20)
            date = xml.substring(10, 20);

        // Article title
        doc_title = root.valueOf("//head/title");

        // Article id
        List<Node> list_doc_id = document.selectNodes("//doc-id/@id-string");
        for (Node ele : list_doc_id) {
            doc_id = Integer.parseInt(ele.getText().trim());
        }

        // Article content: walk the root's children, then body, then body.content
        for (Iterator<Element> i = root.elementIterator(); i.hasNext();) {
            Element el = i.next(); // head, body
            if (!"body".equals(el.getName())) // compare strings with equals, not ==
                continue;

            for (Iterator<Element> body = el.elementIterator(); body.hasNext();) {
                Element elbody = body.next();
                if (!"body.content".equals(elbody.getName()))
                    continue;

                for (Iterator<Element> block = elbody.elementIterator(); block.hasNext();) {
                    Element block_class = block.next();
                    List<Node> paragraphs = block_class.selectNodes("p");

                    // Constant-first equals avoids a NullPointerException
                    // when a block has no class attribute.
                    if ("full_text".equals(block_class.attributeValue("class"))) { // full_text
                        for (Node text : paragraphs)
                            full_text = (full_text == null)
                                    ? text.getStringValue()
                                    : full_text + " " + text.getStringValue();
                    } else { // every other block is treated as lead_paragraph
                        for (Node lead : paragraphs)
                            lead_paragraph = (lead_paragraph == null)
                                    ? lead.getStringValue()
                                    : lead_paragraph + " " + lead.getStringValue();
                    }
                }
            }
        }
    }

    /**
     * Returns the article title.
     */
    public String getTitle() {
        return doc_title;
    }

    /**
     * Returns the article id, or -1 if none was found.
     */
    public int getID() {
        return doc_id;
    }

    /**
     * Returns the lead paragraph.
     */
    public String getLead() {
        if (getID() < 394070 && lead_paragraph != null && lead_paragraph.length() > 6) // before 1990-10-22
            return lead_paragraph.substring(6); // drop the first six characters
        else // after 1990-10-22
            return lead_paragraph;
    }

    /**
     * Returns the article body.
     */
    public String getfull() {
        if (getID() < 394070 && full_text != null && full_text.length() > 6) // before 1990-10-22
            return full_text.substring(6); // drop the first six characters
        else
            return full_text;
    }

    /**
     * Returns the article date.
     */
    public String getDate() {
        return date;
    }

    /**
     * Decides whether the extracted information is usable.
     * @return true if every field is present and within its length limit
     */
    public boolean isUseful() {
        if (getID() == -1)
            return false;
        if (getDate() == null)
            return false;
        if (getTitle() == null || getTitle().length() >= 255)
            return false;
        if (getLead() == null || getLead().length() >= 65535)
            return false;
        if (getfull() == null || getfull().length() >= 65535)
            return false;

        return !isnum();
    }

    /**
     * Detects articles that consist of figures and start with a special marker.
     * @return true if the article should be discarded
     */
    private boolean isnum() {
        String text = getfull();
        // The marker is exactly 20 characters long, so this is a prefix test.
        return text != null && text.length() > 24
                && text.startsWith("*3*** COMPANY REPORT");
    }

    public static void main(String[] args) {
        List<String> listFile = SearchFile.getAllFile("E:\\huadai\\1989\\10", false); // file list
        int count = 0;    // files processed
        int rejected = 0; // files for which isUseful() returned false
        for (String path : listFile) {
            NewsPaper newsPaper = new NewsPaper(path);
            count++;
            if (!newsPaper.isUseful()) {
                rejected++;
                System.out.println(newsPaper.getLead());
            }
        }

        System.out.println(rejected + " " + count);
    }
}
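
Since the article notes that the nested traversal is not the most efficient approach, the same fields can be pulled out with XPath expressions alone. The sketch below swaps dom4j for the JDK's built-in `javax.xml.xpath` API so that it runs without extra jars. The class `XPathNewsSketch`, its method names, the `article` root element, the id value `12345` and the sample paragraphs are all hypothetical; only the element and attribute names queried (`head/title`, `doc-id/@id-string`, `body/body.content/block/p`) come from the code above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch; the XML layout is inferred from the queries in NewsPaper.
final class XPathNewsSketch {

    // Made-up miniature document with the element names NewsPaper looks for.
    static final String SAMPLE =
            "<article>"
          + "<head><title>An End to Nuclear Testing</title>"
          + "<docdata><doc-id id-string=\"12345\"/></docdata></head>"
          + "<body><body.content>"
          + "<block class=\"lead_paragraph\"><p>First paragraph.</p></block>"
          + "<block class=\"full_text\"><p>First paragraph.</p><p>Second paragraph.</p></block>"
          + "</body.content></body>"
          + "</article>";

    /** Parses an XML string into a W3C DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    /** Text of the first head/title element, or "" if there is none. */
    static String title(Document doc) {
        return evaluate(doc, "//head/title");
    }

    /** Value of doc-id/@id-string as an int, or -1 if the attribute is absent. */
    static int docId(Document doc) {
        String id = evaluate(doc, "//doc-id/@id-string").trim();
        return id.isEmpty() ? -1 : Integer.parseInt(id);
    }

    /** Joins the p children of every block with the given class, separated by spaces. */
    static String paragraphs(Document doc, String blockClass) {
        // blockClass is spliced into the expression, so pass trusted constants only.
        String expression = "//body/body.content/block[@class='" + blockClass + "']/p";
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            StringBuilder text = new StringBuilder();
            for (int i = 0; i < nodes.getLength(); i++) {
                if (i > 0) text.append(' ');
                text.append(nodes.item(i).getTextContent());
            }
            return text.toString();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("bad XPath: " + expression, e);
        }
    }

    /** Evaluates an XPath expression and returns its string value. */
    private static String evaluate(Document doc, String expression) {
        try {
            return XPathFactory.newInstance().newXPath().evaluate(expression, doc);
        } catch (XPathExpressionException e) {
            throw new IllegalStateException("bad XPath: " + expression, e);
        }
    }
}
```

Unlike the `else` branch in `NewsPaper`, `paragraphs(doc, "lead_paragraph")` matches only blocks whose class is exactly `lead_paragraph`, so blocks with any other class are ignored rather than merged into the lead.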

That is all for this article. I hope it helps with your learning.


Article source: http://m.jcarcd.cn/article/gccsee.html