Question/Feature request: Sync/merge contacts from a Google/iCloud .vcf export/backup #1056

Open
azbarcea opened this issue Mar 21, 2023 · 9 comments · May be fixed by #1064

Comments

@azbarcea

azbarcea commented Mar 21, 2023

I'd like to know whether vdirsyncer can import/sync contacts from a single unified .vcf file obtained as an export/backup from Google, iCloud, etc.

I presume the singlefile storage type was created for this, but with the following config file, ~/.config/vdirsyncer/config, no collections are discovered:

[general]
status_path = "~/.vdirsyncer/status/"

[pair alex_contacts]
a = "contacts_local"
b = "google_backup"
collections = ["from a", "from b"]
metadata = ["color"]

[storage contacts_local]
type = "filesystem"
path = "~/.contacts/"
fileext = ".vcf"

[storage google_backup]
type = "singlefile"
path = "<path to google export of contacts file>.vcf"

This is the output I get:

$ vdirsyncer -v DEBUG discover
Discovering collections for pair alex_contacts
contacts_local:
google_backup:
Saved for alex_contacts: collections = []

vdirsyncer version: 0.19.1
Server: iCloud/Google export/backup as .vcf
Python version: 3.10.10
OS: arch

@azbarcea
Author

azbarcea commented Mar 21, 2023

Following the docs (config.html?highlight=singlefile#google) and searching for CardDAV, I see both the Google Contacts CardDAV API and the People API.

Within that project, enable the “CalDAV” and “CardDAV” APIs (not the Calendar and Contacts APIs, those are different and won’t work). There should be a searchbox where you can just enter those terms.

I presume the docs refer to the Google Contacts CardDAV API, also known as contacts-api-migration. From its documentation:

The Contacts API was turned down on January 19, 2022. Use this guide to learn about changes to fields, endpoints, and authorization scopes as you migrate to the People API.

Are the vdirsyncer docs still accurate, i.e. is Google Contacts sync still supported?

@WhyNotHugo
Member

We use the CardDAV API: https://developers.google.com/people/carddav

I think that if you had configured something wrong you'd be getting an error, so I don't think you've picked the wrong choice.

Does running vdirsyncer sync not sync anything? Have you tried using collections = null?

@azbarcea
Author

Thank you for your feedback. I'm going to reproduce the issue and post the errors I see.

As an alternative to the Google CardDAV integration, for when the integration doesn't work, I was proposing to support syncing from a backup (export) file in either .csv or .vcf format. This would also cover the scenario where one simply wants to avoid the hurdle of setting up the integration ... (just a thought)

@azbarcea
Author

azbarcea commented Mar 30, 2023

  1. Setting up Google integration

  2. Running discover opens the Google authentication page in the browser. After login, it redirects to http://127.0.0.1:38437/?state=(...) and the page displays: Successfully obtained token. From the terminal:

$ vdirsyncer discover
Discovering collections for pair contacts
contacts_local:
Opening https://accounts.google.com/o/oauth2/v2/auth?response_type=code&client_id=<obfuscated****>&redirect_uri=http%3A%2F%2F127.0.0.1%3A38437&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcalendar&state=YgPT0lTuGayACqEaXRclXSzfGm5MRP&access_type=offline&approval_prompt=force ...
Follow the instructions on the page.
"GET /?state=YgPT0lTuGayACqEaXRclXSzfGm5MRP&code=4/0AVHEtk57zeR2rrcAgiUAvIta4h6c9Ij9GKnfiagqQcvdCy7jalzT_ucovXDHuBAo6dFw4w&scope=https://www.googleapis.com/auth/calendar HTTP/1.1" 200 28
warning: Failed to discover collections for contacts_google, use `-vdebug` to see the full traceback.
error: Unknown error occurred: Not Found
error: Use `-vdebug` to see the full traceback.

Running with -vdebug:

$ vdirsyncer -vdebug discover
Discovering collections for pair contacts
contacts_local:
debug: ====================
debug: PROPFIND https://apidata.googleusercontent.com/caldav/v2/
debug: {'User-Agent': '<obfuscated****>', 'Content-Type': 'application/xml; charset=UTF-8', 'Depth': '1'}
debug: b'\n    <propfind xmlns="DAV:">\n        <prop>\n            <resourcetype />\n        </prop>\n    </propfind>\n    '
debug: Sending request...
debug: 403
debug: <CIMultiDictProxy('Content-Type': 'application/vnd.google.gdata.error+xml; charset=UTF-8', 'Content-Encoding': 'gzip', 'Date': 'Thu, 30 Mar 2023 21:20:31 GMT', 'Server': 'ESF', 'Cache-Control': 'private', 'X-XSS-Protection': '0', 'X-Frame-Options': 'SAMEORIGIN', 'X-Content-Type-Options': 'nosniff', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'Transfer-Encoding': 'chunked')>
debug: <StreamReader 532 bytes>
debug: Given URL is not a homeset URL
debug: ====================
debug: PROPFIND https://apidata.googleusercontent.com/caldav/v2/
debug: {'User-Agent': '<obfuscated****>', 'Content-Type': 'application/xml; charset=UTF-8', 'Depth': '0'}
debug: b'\n        <propfind xmlns="DAV:">\n            <prop>\n                <current-user-principal />\n            </prop>\n        </propfind>\n        '
debug: Sending request...
debug: 403
debug: <CIMultiDictProxy('Content-Type': 'application/vnd.google.gdata.error+xml; charset=UTF-8', 'Content-Encoding': 'gzip', 'Date': 'Thu, 30 Mar 2023 21:20:31 GMT', 'Server': 'ESF', 'Cache-Control': 'private', 'X-XSS-Protection': '0', 'X-Frame-Options': 'SAMEORIGIN', 'X-Content-Type-Options': 'nosniff', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'Transfer-Encoding': 'chunked')>
debug: <StreamReader 532 bytes eof>
debug: Trying out well-known URI
debug: ====================
debug: PROPFIND https://apidata.googleusercontent.com/.well-known/caldav
debug: {'User-Agent': '<obfuscated****>', 'Content-Type': 'application/xml; charset=UTF-8', 'Depth': '0'}
debug: b'\n        <propfind xmlns="DAV:">\n            <prop>\n                <current-user-principal />\n            </prop>\n        </propfind>\n        '
debug: Sending request...
debug: 404
debug: <CIMultiDictProxy('Content-Type': 'text/html; charset=UTF-8', 'Referrer-Policy': 'no-referrer', 'Content-Length': '1579', 'Date': 'Thu, 30 Mar 2023 21:20:31 GMT', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'Connection': 'close')>
debug: <StreamReader 1147 bytes>
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/discover.py", line 263, in _print_collections
debug:     discovered = await get_discovered()
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/discover.py", line 176, in get_self
debug:     self._discovered = await self._discover()
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/discover.py", line 185, in _discover
debug:     return handle_storage_init_error(self._cls, self._config)
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/discover.py", line 181, in _discover
debug:     discovered = await aiostream.stream.list(self._cls.discover(**self._config))
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/.venv/lib/python3.10/site-packages/aiostream/core.py", line 33, in wait_stream
debug:     async for item in streamer:
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/.venv/lib/python3.10/site-packages/aiostream/stream/aggregate.py", line 71, in list
debug:     async for item in streamer:
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 488, in discover
debug:     async for collection in d.discover():
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 274, in discover
debug:     for c in await self.find_collections():
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 235, in find_collections
debug:     self._find_collections_impl(await self.find_home())
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 209, in find_home
debug:     url = await self.find_principal()
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 174, in find_principal
debug:     return await self._find_principal_impl(self._well_known_uri)
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 187, in _find_principal_impl
debug:     response = await self.session.request(
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/google.py", line 68, in request
debug:     return await super().request(method, path, **kwargs)
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 416, in request
debug:     return await http.request(method, url, session=session, **more)
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/http.py", line 155, in request
debug:     raise exceptions.NotFoundError(response.reason)
warning: Failed to discover collections for contacts_google, use `-vdebug` to see the full traceback.
debug: ====================
debug: PROPFIND https://apidata.googleusercontent.com/caldav/v2/
debug: {'User-Agent': '<obfuscated****>', 'Content-Type': 'application/xml; charset=UTF-8', 'Depth': '1'}
debug: b'\n    <propfind xmlns="DAV:">\n        <prop>\n            <resourcetype />\n        </prop>\n    </propfind>\n    '
debug: Sending request...
debug: 403
debug: <CIMultiDictProxy('Content-Type': 'application/vnd.google.gdata.error+xml; charset=UTF-8', 'Content-Encoding': 'gzip', 'Date': 'Thu, 30 Mar 2023 21:20:31 GMT', 'Server': 'ESF', 'Cache-Control': 'private', 'X-XSS-Protection': '0', 'X-Frame-Options': 'SAMEORIGIN', 'X-Content-Type-Options': 'nosniff', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'Transfer-Encoding': 'chunked')>
debug: <StreamReader 532 bytes eof>
debug: Given URL is not a homeset URL
debug: ====================
debug: PROPFIND https://apidata.googleusercontent.com/caldav/v2/
debug: {'User-Agent': '<obfuscated****>', 'Content-Type': 'application/xml; charset=UTF-8', 'Depth': '0'}
debug: b'\n        <propfind xmlns="DAV:">\n            <prop>\n                <current-user-principal />\n            </prop>\n        </propfind>\n        '
debug: Sending request...
debug: 403
debug: <CIMultiDictProxy('Content-Type': 'application/vnd.google.gdata.error+xml; charset=UTF-8', 'Content-Encoding': 'gzip', 'Date': 'Thu, 30 Mar 2023 21:20:31 GMT', 'Server': 'ESF', 'Cache-Control': 'private', 'X-XSS-Protection': '0', 'X-Frame-Options': 'SAMEORIGIN', 'X-Content-Type-Options': 'nosniff', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'Transfer-Encoding': 'chunked')>
debug: <StreamReader 532 bytes eof>
debug: Trying out well-known URI
debug: ====================
debug: PROPFIND https://apidata.googleusercontent.com/.well-known/caldav
debug: {'User-Agent': '<obfuscated****>', 'Content-Type': 'application/xml; charset=UTF-8', 'Depth': '0'}
debug: b'\n        <propfind xmlns="DAV:">\n            <prop>\n                <current-user-principal />\n            </prop>\n        </propfind>\n        '
debug: Sending request...
debug: 404
debug: <CIMultiDictProxy('Content-Type': 'text/html; charset=UTF-8', 'Referrer-Policy': 'no-referrer', 'Content-Length': '1579', 'Date': 'Thu, 30 Mar 2023 21:20:31 GMT', 'Alt-Svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'Connection': 'close')>
debug: <StreamReader 1579 bytes eof>
error: Unknown error occurred: Not Found
error: Use `-vdebug` to see the full traceback.
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/__init__.py", line 32, in inner
debug:     f(*a, **kw)
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/__init__.py", line 221, in discover
debug:     asyncio.run(main())
debug:   File "/usr/lib/python3.10/asyncio/runners.py", line 44, in run
debug:     return loop.run_until_complete(main)
debug:   File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
debug:     return future.result()
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/__init__.py", line 213, in main
debug:     await discover_collections(
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/tasks.py", line 92, in discover_collections
debug:     rv = await collections_for_pair(pair=pair, **kwargs)
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/discover.py", line 97, in collections_for_pair
debug:     rv = await aiostream.stream.list(
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/.venv/lib/python3.10/site-packages/aiostream/core.py", line 33, in wait_stream
debug:     async for item in streamer:
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/.venv/lib/python3.10/site-packages/aiostream/stream/aggregate.py", line 71, in list
debug:     async for item in streamer:
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/discover.py", line 212, in expand_collections
debug:     collections = await get_b_discovered()
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/discover.py", line 176, in get_self
debug:     self._discovered = await self._discover()
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/discover.py", line 185, in _discover
debug:     return handle_storage_init_error(self._cls, self._config)
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/cli/discover.py", line 181, in _discover
debug:     discovered = await aiostream.stream.list(self._cls.discover(**self._config))
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/.venv/lib/python3.10/site-packages/aiostream/core.py", line 33, in wait_stream
debug:     async for item in streamer:
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/.venv/lib/python3.10/site-packages/aiostream/stream/aggregate.py", line 71, in list
debug:     async for item in streamer:
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 488, in discover
debug:     async for collection in d.discover():
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 274, in discover
debug:     for c in await self.find_collections():
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 235, in find_collections
debug:     self._find_collections_impl(await self.find_home())
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 209, in find_home
debug:     url = await self.find_principal()
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 174, in find_principal
debug:     return await self._find_principal_impl(self._well_known_uri)
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 187, in _find_principal_impl
debug:     response = await self.session.request(
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/google.py", line 68, in request
debug:     return await super().request(method, path, **kwargs)
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/storage/dav.py", line 416, in request
debug:     return await http.request(method, url, session=session, **more)
debug:   File "/home/alex/projects/markdown-contacts/vdirsyncer/vdirsyncer/http.py", line 155, in request
debug:     raise exceptions.NotFoundError(response.reason)

This has been executed in a virtualenv:

$ pip freeze
aiohttp==3.8.4
aiohttp-oauthlib==0.1.0
aioresponses==0.7.4
aiosignal==1.3.1
aiostream==0.4.5
alabaster==0.7.13
async-timeout==4.0.2
atomicwrites==1.4.1
attrs==22.2.0
Babel==2.12.1
certifi==2022.12.7
cffi==1.15.1
cfgv==3.3.1
charset-normalizer==3.1.0
click==8.1.3
click-log==0.4.0
coverage==7.2.2
cryptography==39.0.2
distlib==0.3.6
docutils==0.18.1
exceptiongroup==1.1.1
filelock==3.10.0
frozenlist==1.3.3
hypothesis==6.70.0
identify==2.5.21
idna==3.4
imagesize==1.4.1
iniconfig==2.0.0
Jinja2==3.1.2
MarkupSafe==2.1.2
multidict==6.0.4
nodeenv==1.7.0
oauthlib==3.2.2
packaging==23.0
platformdirs==3.1.1
pluggy==1.0.0
pre-commit==3.2.0
pycparser==2.21
Pygments==2.14.0
pytest==7.2.2
pytest-asyncio==0.21.0
pytest-cov==4.0.0
pytest-httpserver==1.0.6
PyYAML==6.0
requests==2.28.2
requests-toolbelt==0.10.1
setuptools-scm==7.1.0
snowballstemmer==2.2.0
sortedcontainers==2.4.0
Sphinx==6.1.3
sphinx-rtd-theme==1.2.0
sphinxcontrib-applehelp==1.0.4
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==2.0.1
sphinxcontrib-jquery==4.1
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.5
tomli==2.0.1
trustme==0.9.0
typing_extensions==4.5.0
urllib3==1.26.15
-e git+ssh://git@github.com/pimutils/vdirsyncer.git@b1ef68089b11eed1050542108296b9a8d7dd0ae6#egg=vdirsyncer
virtualenv==20.21.0
Werkzeug==2.2.3
yarl==1.8.2

While ~/.config/vdirsyncer/config is:

[general]
status_path = "~/.vdirsyncer/status/"

[pair contacts]
a = "contacts_local"
b = "contacts_google"

collections = ["from a", "from b"]
metadata = ["color"]
conflict_resolution = "b wins"

[storage contacts_local]
type = "filesystem"
path = "~/.contacts/"
fileext = ".vcf"

[storage contacts_google]
type = "google_calendar"
token_file = "/home/alex/.config/vdirsyncer/apps.googleusercontent.com.token"
client_id = "<obfuscated****>"
client_secret = "<obfuscated****>"

@azbarcea
Author

Trying to make it sync with Google Contacts didn't work for me. I'm now focused on making it work with the singlefile option, using the export file as a source.

@azbarcea
Author

It looks to me like the real issue preventing Google-to-local sync from working is documented in #975.

As that issue is not currently fixed, I wonder what you think of the alternative of using the backup file as a singlefile storage.

@azbarcea
Author

azbarcea commented Apr 20, 2023

Does anyone know what the following code in vdirsyncer/storage/singlefile.py is supposed to do?

        path = os.path.abspath(expand_path(path))
        try:
            path_glob = path % "*"
        except TypeError:
            # If not exactly one '%s' is present, we cannot discover
            # collections because we wouldn't know which name to assign.
            raise NotImplementedError

path will obviously be something like /home/user/path/to/latest/backup.vcf

@azbarcea azbarcea linked a pull request Apr 20, 2023 that will close this issue
@WhyNotHugo
Member

@azbarcea A storage can have many collections:

  • A caldav storage has many calendar collections.
  • A filesystem storage can have multiple directories which are each a single calendar collection.
  • For singlefile the %s in the filename is replaced with the name id of the collection.

You can sync a caldav storage (aka account, which can have multiple calendars) to a singlefile storage (each calendar is one file).
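
To illustrate the %s placeholder described above, here is a minimal Python sketch. It is not vdirsyncer's code: collection_path and can_discover are hypothetical helper names and the example paths are made up. It relies only on standard % string formatting, which raises TypeError when the template does not consume exactly one argument.

```python
def collection_path(template: str, collection: str) -> str:
    """Fill the single '%s' placeholder with a collection name."""
    return template % collection


def can_discover(template: str) -> bool:
    """Return True if formatting the template with one value succeeds."""
    try:
        template % "*"
    except TypeError:
        # Zero placeholders or more than one: no unambiguous collection name.
        return False
    return True


# One file per collection: the placeholder becomes the collection name.
print(collection_path("/home/user/contacts/%s.vcf", "family"))
# /home/user/contacts/family.vcf

# A template with one '%s' can be turned into a glob pattern.
print(can_discover("/home/user/contacts/%s.vcf"))  # True

# A plain file path has no placeholder, so formatting raises TypeError.
print(can_discover("/home/user/path/to/latest/backup.vcf"))  # False
```

Under this reading, a path without %s is not an invalid path; it simply describes a single file with no collection name to discover.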

@azbarcea
Author

azbarcea commented Apr 21, 2023

Note: I was using singlefile and tested only with a contact collection.

In short, and maybe this is the real issue: on Python 3.10.10, the line path_glob = path % "*" in singlefile.py always raises a TypeError for me, even though path is a valid filesystem path (e.g. /home/x/y/z/google.vcf). After troubleshooting, I realized that although path was valid, the code wasn't extracting the filename stem from it.

Coming back to your reply: I presume the purpose of this code was to validate that path is a valid path. Responding inline:

@azbarcea A storage can have many collections:

  • A caldav storage has many calendar collections.

True. A singlefile contact storage, as I see it implemented, has only one collection, and by convention the collection has the same name as the file. (I didn't test with an .ics export, as I couldn't find a way to get one from Google.)

  • A filesystem storage can have multiple directories which are each a single calendar collection.

The goal is indeed to sync the directory structure: ~/.contacts/{collection1, family, google, etc} with the google backup /home/x/y/z/google.vcf

  • For singlefile the %s in the filename is replaced with the name id of the collection.

You can sync a caldav storage (aka account, which can have multiple calendars) to a singlefile storage (each calendar is one file).

Now, maybe there is a problem with the glob implementation in Python 3.10.10, but the way it was coded, it would not have replaced the name/id of the collection.
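
For reference, here is a small, self-contained sketch of how discovery from a %s template could work using only the standard glob module. This is an assumption-laden illustration, not the code in singlefile.py: discover_collections is a hypothetical name, and the template is assumed to contain exactly one %s.

```python
import glob
import os
import tempfile


def discover_collections(template: str) -> list[str]:
    """List collection names for files matching a '%s' path template."""
    # Assumes exactly one '%s'; split() would fail to unpack otherwise.
    prefix, suffix = template.split("%s")
    names = []
    # '%' is plain string formatting here; glob only sees the resulting pattern.
    for match in glob.glob(template % "*"):
        if os.path.isfile(match):
            # The collection name is whatever the '*' matched.
            names.append(match[len(prefix):len(match) - len(suffix)])
    return sorted(names)


# Demo in a throwaway directory with two empty .vcf files.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("family", "google"):
        with open(os.path.join(tmp, name + ".vcf"), "w") as f:
            f.write("")
    print(discover_collections(os.path.join(tmp, "%s.vcf")))
    # ['family', 'google']
```

The point of the sketch is that the % operator only builds the pattern string; the actual file matching is done by glob afterwards.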

Now, with a configuration like:

[storage contacts_google_backup]
type = "singlefile"
path = "/home/x/y/z/google.vcf"

It will extract the stem of the path /home/x/y/z/google.vcf, which is google. The extension is ignored, so it can match .vcf, .ics, or just a file name without an extension (google is just an example; nothing is hardcoded).
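
As a rough illustration of the stem extraction described here (a sketch only, not the code from the PR; collection_from_path is a hypothetical name), pathlib can derive the name directly:

```python
from pathlib import Path


def collection_from_path(path: str) -> str:
    """Derive a collection name by dropping the directory and the last extension."""
    return Path(path).stem


print(collection_from_path("/home/x/y/z/google.vcf"))  # google
print(collection_from_path("/home/x/y/z/google.ics"))  # google
print(collection_from_path("/home/x/y/z/google"))      # google
```

Note that only the last suffix is removed, so a name such as backup.tar.gz would yield backup.tar.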

For path_glob to be an actual glob, instead of:

path_glob = path % "*"

maybe it should have been something like the following (assuming path is a pathlib.Path rather than a str):

path_glob = path.glob("*")

A more generic example to understand/play with glob:

from pathlib import Path

# Assume that path_str is a directory followed by a glob pattern
path_str = "/path/to/my/files/*.txt"

# Split path_str into the directory to search and the pattern to match
path = Path(path_str)
directory, pattern = path.parent, path.name

# Check that the directory exists before searching it
if directory.is_dir():
    # Use glob to search for files that match the specified pattern
    for file_path in directory.glob(pattern):
        # Do something with each file_path here
        print(file_path)
else:
    print("Invalid path:", path_str)

In conclusion, as I understand it, the PR (#1064) as proposed fully satisfies the constraints/requirements you mentioned above. Because the test/storage/test_singlefile.py test is failing, I will follow up as I understand more ...

Any guidance is much appreciated!
