Python: how to upload a local file to HDFS

coghost 2015-10-14 04:03:38
Problem:
Testing from my local machine (192.168.6.3):
I can list files with listdir,
and I can create directories with mkdirs and delete them with delete,
but I cannot upload a file to HDFS with copy_from_local.

The exact same code runs without any problem from 192.168.6.154.

Environment:
Python 2.7
pyhdfs for HDFS access
local machine IP: 192.168.6.3
Hadoop:
(namenode)
192.168.6.151
(datanode)
192.168.6.152
192.168.6.154 (unused)

151 is a virtual machine; 152, 154, and the rest are clones of 151.



#!/usr/bin/env python
# encoding: utf-8

from pyhdfs import HdfsClient

client = HdfsClient(hosts='192.168.6.151:50070')
client.mkdirs("/user/lfp/001")
# fails here (note the target is /tmp/lfp/001/, not the /user/lfp/001 created above)
client.copy_from_local('/tmp/sm', '/tmp/lfp/001/', overwrite=True)
print(client.listdir('/user/lfp'))
client.delete("/user/aas/001")
print(client.listdir('/user/lfp'))



Error:

$ python test_hdfs.py
Traceback (most recent call last):
  File "test_hdfs.py", line 7, in <module>
    client.copy_from_local('/tmp/sm', '/tmp/lfp/001/', overwrite=True)
  File "build/bdist.linux-x86_64/egg/pyhdfs.py", line 717, in copy_from_local
  File "build/bdist.linux-x86_64/egg/pyhdfs.py", line 393, in create
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 99, in put
    return request('put', url, data=data, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/api.py", line 44, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 455, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/sessions.py", line 558, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python2.7/dist-packages/requests/adapters.py", line 378, in send
    raise ConnectionError(e)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='hdp152', port=50075): Max retries exceeded with url: /webhdfs/v1/tmp/lfp/001/?op=CREATE&user.name=lfp&namenoderpcaddress=hdp151:9000&overwrite=true (Caused by <class 'socket.error'>: [Errno 110] Connection timed out)
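The traceback gives a strong hint. The metadata calls (listdir, mkdirs, delete) only ever talk to the namenode at 192.168.6.151:50070, but a WebHDFS upload is a two-step write: the namenode answers the initial PUT with a 307 redirect that names a datanode by hostname (hdp152:50075 here), and the client then has to resolve and reach that hostname itself. 192.168.6.154, being a clone of the cluster machines, presumably has those hostnames in its /etc/hosts, while 192.168.6.3 does not, so the second request times out. A minimal sketch of the redirect step (the URL below is taken from the error above; the helper name is mine, not pyhdfs internals):

```python
try:  # Python 2.7, as in the question
    from urlparse import urlparse
except ImportError:  # Python 3
    from urllib.parse import urlparse

def redirect_target(location):
    """Return the (host, port) a WebHDFS CREATE redirect points at.

    The namenode answers the first PUT with a 307 whose Location header
    names a datanode, usually by hostname; the client must be able to
    resolve and reach that host, or the upload times out.
    """
    parsed = urlparse(location)
    return parsed.hostname, parsed.port

# The Location implied by the traceback above:
host, port = redirect_target(
    "http://hdp152:50075/webhdfs/v1/tmp/lfp/001/?op=CREATE"
    "&user.name=lfp&namenoderpcaddress=hdp151:9000&overwrite=true")
print(host, port)  # the client machine must be able to resolve this host
```

One common fix is to add hdp151/hdp152/hdp154 entries to /etc/hosts on the client (192.168.6.3); alternatively, depending on the Hadoop version, the cluster or client can be configured around hostname advertising (e.g. dfs.client.use.datanode.hostname).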
5 replies
qq_42498524 2019-10-28
SystemError: Failed to conncect to localhost:9000 (connection failed)
alinly 2016-03-07
Could the target already exist and refuse to be overwritten: client.copy_from_local('/tmp/sm', '/tmp/lfp/001/', overwrite=True)?
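Probably not the cause here: overwrite=True is forwarded as a WebHDFS query parameter, and it is visible in the failing URL above (overwrite=true), so an existing target would have produced a FileAlreadyExistsException from the namenode rather than a connection timeout on the datanode. A sketch of how the flag ends up in the URL (create_url is an illustrative helper, not pyhdfs internals):

```python
try:  # Python 2.7, as in the question
    from urllib import urlencode
except ImportError:  # Python 3
    from urllib.parse import urlencode

def create_url(namenode, path, user, overwrite):
    """Build the WebHDFS CREATE URL a client PUTs to first.

    WebHDFS expects booleans as lowercase strings, so the Python-level
    overwrite=True becomes the literal query parameter overwrite=true.
    """
    params = {
        "op": "CREATE",
        "user.name": user,
        "overwrite": "true" if overwrite else "false",
    }
    return "http://%s/webhdfs/v1%s?%s" % (namenode, path, urlencode(params))

print(create_url("192.168.6.151:50070", "/tmp/lfp/001/sm", "lfp", True))
```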
alinly 2016-03-07
Quoting reply #1 by qq_32940231:
>>> import pyhdfs
>>> fs = pyhdfs.connect('localhost',9000)
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
    at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
Can't construct instance of class org.apache.hadoop.conf.Configuration
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: Failed to conncect to localhost:9000
Missing environment variables? Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
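That NoClassDefFoundError comes from the JVM, which suggests this is the other pyhdfs, the libhdfs/JNI-based bindings with a connect(host, port) API, not the WebHDFS client used in the question. The embedded JVM needs the Hadoop jars on its CLASSPATH before connect() is called. A sketch of building that CLASSPATH, assuming the usual layout of jars under $HADOOP_HOME/share/hadoop (hadoop_classpath is my name, not part of any library):

```python
import glob
import os

def hadoop_classpath(hadoop_home):
    """Collect every Hadoop jar under hadoop_home into a CLASSPATH string.

    The JNI-based HDFS bindings embed a JVM, and that JVM needs
    org.apache.hadoop.conf.Configuration (and friends) on its CLASSPATH,
    otherwise connect() fails with NoClassDefFoundError as quoted above.
    """
    jars = sorted(glob.glob(os.path.join(hadoop_home, "share", "hadoop",
                                         "*", "*.jar")))
    return os.pathsep.join(jars)

# Illustrative usage: export CLASSPATH before the bindings start the JVM.
# os.environ["CLASSPATH"] = hadoop_classpath("/opt/hadoop")
```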
花已伤完 2016-03-01
Can anyone help with this problem?
>>> import pyhdfs
>>> fs = pyhdfs.connect('localhost',9000)
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
    at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
Can't construct instance of class org.apache.hadoop.conf.Configuration
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
SystemError: Failed to conncect to localhost:9000
