EDITED
This issue has been resolved as of NLTK v3.2.1. Upgrading your NLTK version should fix it, e.g. pip install -U nltk
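To tell whether an installed copy predates the fix, the version string can be compared against 3.2.1. The following is only a minimal sketch: the helper names version_tuple and needs_upgrade are made up for illustration and are not part of NLTK, and the parsing assumes a plain dotted version string.

```python
import re

# Release that contains the fix, per the note above.
FIXED_IN = (3, 2, 1)


def version_tuple(version):
    """Turn a dotted version string such as '3.2' into a 3-tuple of ints."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    # Pad short versions with zeros: '3.2' becomes (3, 2, 0).
    while len(numbers) < 3:
        numbers.append(0)
    return tuple(numbers)


def needs_upgrade(version):
    """Return True if the given version is older than the fixed release."""
    return version_tuple(version) < FIXED_IN
```

With NLTK installed, needs_upgrade(nltk.__version__) would report whether the upgrade is still needed.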
I faced the same issue, and the error I got was as follows:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\tag\__init__.py", line 110, in pos_tag
    tagger = PerceptronTagger()
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\tag\perceptron.py", line 141, in __init__
    self.load(AP_MODEL_LOC)
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\tag\perceptron.py", line 209, in load
    self.model.weights, self.tagdict, self.classes = load(loc)
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\data.py", line 801, in load
    opened_resource = _open(resource_url)
  File "C:\Python27\lib\site-packages\nltk-3.2-py2.7.egg\nltk\data.py", line 924, in _open
    return urlopen(resource_url)
  File "C:\Python27\lib\urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "C:\Python27\lib\urllib2.py", line 391, in open
    response = self._open(req, data)
  File "C:\Python27\lib\urllib2.py", line 414, in _open
    'unknown_open', req)
  File "C:\Python27\lib\urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "C:\Python27\lib\urllib2.py", line 1206, in unknown_open
    raise URLError('unknown url type: %s' % type)
urllib2.URLError: <urlopen error unknown url type: c>
The URLError that you mentioned was due to a bug in the perceptron.py file of the NLTK library that shows up on Windows.
On my machine, the file is at this location:
C:\Python27\Lib\site-packages\nltk-3.2-py2.7.egg\nltk\tag\perceptron.py
(Look in the equivalent location on yours, wherever your Python27 folder is.)
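The bug was in the code that finds the location of the averaged_perceptron_tagger model on your machine. The location it produced was a bare Windows path starting with a drive letter (C:...), and that string was later passed to urlopen, which read the drive letter as a URL scheme; hence "unknown url type: c". Lines 801 and 924 of data.py, shown in the traceback, are where this happens.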
I think the NLTK developer community recently fixed this bug in the code. Have a look at this commit made to their code a few days back.
https://github.com/nltk/nltk/commit/d3de14e58215beebdccc7b76c044109f6197d1d9#diff-26b258372e0d13c2543de8dbb1841252
The snippet where the change was made is as follows:
        self.tagdict = {}
        self.classes = set()
        if load:
            AP_MODEL_LOC = 'file:'+str(find('taggers/averaged_perceptron_tagger/'+PICKLE))
            self.load(AP_MODEL_LOC)
            # Initially it was: AP_MODEL_LOC = str(find('taggers/averaged_perceptron_tagger/'+PICKLE))

    def tag(self, tokens):
Updating the file to the most recent commit worked for me, and I was then able to use the nltk.pos_tag command. I believe this will resolve your problem as well (assuming you have everything else set up).
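Why the 'file:' prefix matters can be seen without NLTK at all. The sketch below is only an illustration, not NLTK's own code: it uses Python 3's urllib.parse rather than the urllib2 module from the traceback, the helper name url_scheme is made up, and the model path is a hypothetical example.

```python
from urllib.parse import urlparse


def url_scheme(location):
    """Return the URL scheme that a URL parser sees in the given string."""
    return urlparse(location).scheme


# Hypothetical Windows location of a pickled model (not a real NLTK path).
bare_path = r"C:\nltk_data\taggers\example\example.pickle"

# Without a prefix, the drive letter is taken for the scheme.
print(url_scheme(bare_path))            # c

# With the prefix from the fix, the scheme is 'file', which URL openers know.
print(url_scheme("file:" + bare_path))  # file
```

A bare path therefore looks like a URL of type "c", which matches the message in the traceback, while the prefixed string is recognised as a file URL.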