UltiSnips/utils/get_tm_snippets.py

#!/usr/bin/env python
# encoding: utf-8

import urllib
import re
from xml.etree import ElementTree
from xml.parsers.expat import ExpatError
import htmlentitydefs

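_UNESCAPE = re.compile(ur'&\w+?;', re.UNICODE)
def unescape(s):
    if s is None:
        return ""
    def fixup(m):
        ent = m.group(0)[1:-1]
        # Leave entities that htmlentitydefs does not know untouched
        if ent not in htmlentitydefs.name2codepoint:
            return m.group(0)
        return unichr(htmlentitydefs.name2codepoint[ent])
    try:
        # ElementTree returns unicode for non-ASCII text; only byte strings need decoding
        if isinstance(s, str):
            s = s.decode("utf-8")
        return _UNESCAPE.sub(fixup, s).encode("utf-8")
    except UnicodeError:
        # Not valid UTF-8: report it and hand the bytes back unchanged
        print repr(s)
        return s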

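def parse_content(c):
    try:
        # The plist root holds a single <dict> whose children alternate
        # between <key> elements and their values
        data = ElementTree.fromstring(c)[0]

        rv = {}
        for k,v in zip(data[::2], data[1::2]):
            rv[k.text] = unescape(v.text)

        return rv
    except ExpatError:
        print "   Syntax Error"
        return None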

def fetch_snippets(name):
    base_url = "http://svn.textmate.org/trunk/Bundles/" + name + ".tmbundle/"
    snippet_idx = base_url + "Snippets/"

    idx_list = urllib.urlopen(snippet_idx).read()

    rv = []
    for item in re.findall("<li>(.*?)</li>", idx_list):
        m = re.match(r'<a\s*href="(.*)"\s*>(.*)</a>', item)
        if m is None:
            continue # list item that is not a plain link
        link, fname = m.groups()
        if fname == "..":
            continue # parent directory entry

        snip_name = unescape(fname.rsplit('.', 1)[0]) # remove extension
        print "Fetching data for Snippet '%s'" % snip_name
        content = urllib.urlopen(snippet_idx + link).read()

        cont = parse_content(content)
        if cont:
            rv.append((snip_name, cont))

    return rv


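def write_snippets(snip_descr, f):

    for name, d in snip_descr:
        # Skip snippets that lack a trigger or a body
        if not d.get("tabTrigger") or not d.get("content"):
            continue

        f.write('snippet %s "%s"\n' % (d["tabTrigger"], name))
        # unescape() already returns UTF-8 bytes; encoding them again
        # raises UnicodeDecodeError on non-ASCII text
        f.write(d["content"] + "\n")
        f.write("endsnippet\n\n")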


if __name__ == '__main__':
    import sys

    bundle = sys.argv[1]
    rv = fetch_snippets(bundle)
    out = open("tm_" + bundle.lower() + ".snippets", "w")
    try:
        write_snippets(rv, out)
    finally:
        out.close() # make sure the output is flushed to disk
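
The parsing step can be tried without network access. The following is a minimal sketch, assuming a Python 3 port of `unescape` and `parse_content` (the script above targets Python 2); the `SAMPLE` plist is hypothetical and made up for illustration. It shows how the children of the plist `<dict>` are paired up as key/value, and why a second unescape pass is applied to text that the XML parser has already decoded.

```python
import html.entities
import re
from xml.etree import ElementTree

_ENTITY = re.compile(r"&(\w+?);")


def unescape(text):
    """Replace named HTML entities; unknown names are left untouched."""
    if text is None:
        return ""

    def fixup(match):
        codepoint = html.entities.name2codepoint.get(match.group(1))
        return match.group(0) if codepoint is None else chr(codepoint)

    return _ENTITY.sub(fixup, text)


def parse_content(xml_text):
    """Return the plist's top-level <dict> as a Python dict, or None."""
    try:
        data = ElementTree.fromstring(xml_text)[0]
    except ElementTree.ParseError:
        return None
    # Children alternate: <key>, value, <key>, value, ...
    return {k.text: unescape(v.text) for k, v in zip(data[::2], data[1::2])}


# Hypothetical snippet file, invented for this example
SAMPLE = """<plist version="1.0">
<dict>
    <key>content</key>
    <string>&lt;a href="$1"&gt;$0&lt;/a&gt; &amp;copy;</string>
    <key>name</key>
    <string>Anchor</string>
    <key>tabTrigger</key>
    <string>a</string>
</dict>
</plist>
"""

snippet = parse_content(SAMPLE)
print(snippet["tabTrigger"], snippet["content"])
```

The XML parser turns `&amp;copy;` into the literal text `&copy;`, and only the extra `unescape` pass resolves that to the copyright sign. Returning `None` for malformed input mirrors the `ExpatError` branch of the original.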
Added a very basic fetching utility that fetches TextMate Bundle Snippets from the TextMate site and writes out a file that is compatible with us. Also added a new critical feature/bug that needs fixing: Tabstops in Default Text of Tabstops 2009-07-07 03:41:09 -04:00
Beefed converter script a bit. Still an ugly hack. Imported HTML and CSS snippets from TextMate 2009-07-19 16:53:25 -04:00
Fixed a small bug while printing out non-ASCII chars 2009-07-30 14:03:56 -04:00
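
To see the file layout that `write_snippets` produces, the function can be run against an in-memory buffer. This is a sketch under assumptions: it is a Python 3 rendition of the function, and the two entries in `snippets` are hypothetical data, not taken from any real bundle.

```python
import io


def write_snippets(snippets, out):
    """Write (name, plist_dict) pairs in the snippet/endsnippet layout."""
    for name, d in snippets:
        if "tabTrigger" not in d:
            continue  # no trigger to expand on, so the entry is skipped
        out.write('snippet %s "%s"\n' % (d["tabTrigger"], name))
        out.write(d["content"] + "\n")
        out.write("endsnippet\n\n")


# Hypothetical data; the second entry mimics a snippet without a trigger
snippets = [
    ("Anchor", {"tabTrigger": "a", "content": '<a href="$1">$0</a>'}),
    ("Menu only", {"content": "never written"}),
]

buf = io.StringIO()
write_snippets(snippets, buf)
result = buf.getvalue()
print(result)
```

Each written entry starts with a `snippet <trigger> "<name>"` line, followed by the body and a closing `endsnippet` line; entries without a `tabTrigger` key leave no trace in the output.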