InternetPirate @lemmy.fmhy.ml
Posts 78
Comments 186
ShareGPT @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml

Tingling and palpitations spreading throughout the body

The tingling and palpitations that have been spreading throughout the body can be caused by several conditions. Here are some possibilities based on the search results:

  • Episodic Neurologic Symptoms: Patients with this condition usually complain of severe apprehension, dizziness, chest discomfort, palpitations, or fear of "losing control." The onset is usually under age 40[1].

  • Heart Attack or Panic Attack: Both conditions can cause chest pain, shortness of breath, dizziness, sweating, passing out, tingling, or a sensation of impending doom. These shared symptoms of heart and panic attack may also be caused by other serious conditions such as blood clots, lung infection or collapse, or tear in the large vessels of the chest for patients with certain pre-existing risk factors[2].

  • Atrial Fibrillation: The most obvious symptom of this condition is heart palpitations, where the heart feels like it's pounding, fluttering, or beating irregularly, often for a few seconds or possibly a few minutes. Other symptoms include tiredness, feeling lethargic, chest pain, dizziness, and shortness of breath[3][6].

  • Peripheral Neuropathy: Symptoms of this condition include numbness and tingling in the feet or hands, burning, stabbing or shooting pain in affected areas, loss of balance and coordination, and muscle weakness, especially in the feet. These symptoms are usually constant but may come and go[4].

It's important to see a doctor if you experience any of these symptoms to determine the underlying cause and receive appropriate treatment.

Citations: [1] https://www.ncbi.nlm.nih.gov/books/NBK374/ [2] https://www.cedars-sinai.org/blog/is-it-a-heart-attack-or-a-panic-attack.html [3] https://www.nhs.uk/conditions/atrial-fibrillation/symptoms/ [4] https://www.nhsinform.scot/illnesses-and-conditions/brain-nerves-and-spinal-cord/peripheral-neuropathy [5] https://www.webmd.com/heart-disease/heart-palpitations-emergency [6] https://www.hopkinsmedicine.org/health/conditions-and-diseases/atrial-fibrillation/afib-symptoms

0
How to download a website for offline browsing using wget?
  • There are several Linux command-line tools that can be used to download a mirror of a website for offline browsing. Here are some of the most popular ones:

    1. wget: wget is a free utility for non-interactive download of files from the Web. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies. To download a website recursively, use the -r option, or --mirror, which turns on recursion with infinite depth plus timestamping. For example, the following command will download the entire website located at http://example.com:

      wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.com
      

      This will create a local copy of the website in a directory named example.com.

    2. httrack: httrack is a free and open-source website copier that allows you to download a website and browse it offline. It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies. To download a website, use the -O option followed by the directory where you want to save the website. For example, the following command will download the entire website located at http://example.com:

      httrack http://example.com -O /path/to/save/directory
      

      This will create a local copy of the website in the directory /path/to/save/directory/example.com.

    3. curl: curl is a command-line tool for transferring data from or to a server, using one of the supported protocols (HTTP, HTTPS, FTP, etc.). Unlike wget and httrack, curl has no recursive download mode: its -r option selects a byte range rather than enabling recursion, so it cannot mirror a site on its own. It can still save a single page. For example, the following command follows redirects and saves the page at http://example.com to a file named index.html:

      curl -L -o index.html http://example.com
      

      This will save only that one page in the current directory, without its images, stylesheets, or linked pages.

    These tools have different options and features, so you should choose the one that best suits your needs.


  • How to download a website for offline browsing using wget?
  • wget -mkEpnp

    wget --mirror --convert-links --adjust-extension --page-requisites --no-parent http://example.org

    Explanation of the various flags:

    --mirror – Makes (among other things) the download recursive.
    --convert-links – convert all the links (also to stuff like CSS stylesheets) to relative, so it will be suitable for offline viewing.
    --adjust-extension – Adds suitable extensions to filenames (html or css) depending on their content-type.
    --page-requisites – Download things like CSS style-sheets and images required to properly display the page offline.
    --no-parent – When recursing, do not ascend to the parent directory. It is useful for restricting the download to only a portion of the site.
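For anyone who wants to drive this from a script, here is a minimal Python sketch that assembles the same long-form flags explained above. The function names `build_wget_mirror_command` and `mirror_site` are made up for illustration, and the sketch assumes wget is installed and on the PATH.

```python
import subprocess


def build_wget_mirror_command(url: str) -> list[str]:
    """Return the wget argument list for mirroring `url` for offline viewing."""
    return [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--convert-links",     # rewrite links so they work offline
        "--adjust-extension",  # add .html/.css extensions where needed
        "--page-requisites",   # also fetch images, stylesheets, etc.
        "--no-parent",         # never ascend above the starting directory
        url,
    ]


def mirror_site(url: str, target_dir: str = ".") -> int:
    """Run wget inside `target_dir` and return its exit status."""
    completed = subprocess.run(build_wget_mirror_command(url), cwd=target_dir)
    return completed.returncode
```

Calling `mirror_site("http://example.org")` would run the same command as the one-liner above; passing the arguments as a list avoids shell quoting problems.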
    

    wget -mpHkKEb -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' http://www.example.com

    -m (--mirror) : turn on options suitable for mirroring (infinite recursive download and timestamps).
    
    -p (--page-requisites) : download all files that are necessary to properly display a given HTML page. This includes such things as inlined images, sounds, and referenced stylesheets.
    
    -H (--span-hosts): enable spanning across hosts when doing recursive retrieving.
    
    -k (--convert-links) : after the download, convert the links in document for local viewing.
    
    -K (--backup-converted) : when converting a file, back up the original version with a .orig suffix. Affects the behavior of -N.
    
    -E (--adjust-extension) : add the proper extension to the end of the file.
    
    -b (--background) : go to background immediately after startup. If no output file is specified via the -o, output is redirected to wget-log.
    
    -e (--execute) : execute command (robots=off).
    
    -t number (--tries=number) : set number of tries to number.
    
    -U (--user-agent) : identify as agent-string to the HTTP server. Some servers may ban you permanently for recursive downloads if you send the default User Agent.
    

    Cronjobs

    0 23 * * * cd ~/Documents/Webs/mirror; wget -mpk -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' https://example.com

    0 23 * * * cd ~/Documents/Webs/mirror; wget -mpkH -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' -D example.com https://example.com

    0 23 * 1 * cd ~/Documents/Webs/mirror; wget -mpk -t 1 -e robots=off -U 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:40.0) Gecko/20100101 Firefox/40.0' https://example.com

    0 8 * * * pkill wget; cd ~/Documents/Webs/mirror/; rm wget*

  • ShareGPT @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml

    How to download a website for offline browsing using wget?

    To download a website for offline browsing using wget, you can use the following command:

    wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains example.com --no-parent http://www.example.com

    Replace example.com with the website you want to download. This command will download the entire website, including all the elements that compose the page (images, CSS, etc.), and convert the links so that they work locally and offline[1].

    Here's a brief explanation of the options used in the command:

    • --recursive: Download the entire website recursively.
    • --no-clobber: Don't overwrite existing files.
    • --page-requisites: Download all the elements required to display the page properly (images, CSS, etc.).
    • --html-extension: Save files with the .html extension.
    • --convert-links: Convert links so that they work locally and offline.
    • --restrict-file-names=windows: Modify filenames so that they work in Windows as well.
    • --domains example.com: Don't follow links outside the specified domain.
    • --no-parent: Don't follow links outside the specified directory.

    If the website uses external resources like jQuery, you can try using the --span-hosts option to download resources from other domains as well:

    wget --recursive --no-clobber --page-requisites --html-extension --convert-links --restrict-file-names=windows --domains example.com --span-hosts --no-parent http://www.example.com

    However, this may also download additional content from other domains that you might not need. To avoid downloading unnecessary content, you can manually download the required resources (like jQuery) and update the HTML files to use the local copies instead.

    Keep in mind that some websites may have measures in place to prevent downloading their content using tools like wget. In such cases, you may need to adjust the command options or use alternative methods to download the website for offline browsing[6].

    Citations: [1] https://www.linuxjournal.com/content/downloading-entire-web-site-wget [2] https://winaero.com/make-offline-copy-of-a-site-with-wget-on-windows-and-linux/amp/ [3] https://stackoverflow.com/questions/10842263/wget-download-for-offline-viewing-including-absolute-references [4] https://askubuntu.com/questions/391622/download-a-whole-website-with-wget-or-other-including-all-its-downloadable-con [5] https://superuser.com/questions/970323/using-wget-to-copy-website-with-proper-layout-for-offline-browsing [6] https://www.computerhope.com/unix/wget.htm [7] https://superuser.com/questions/1672776/download-whole-website-wget [8] https://gist.github.com/stvhwrd/985dedbe1d3329e68d70 [9] https://simpleit.rocks/linux/how-to-download-a-website-with-wget-the-right-way/ [10] https://www.guyrutenberg.com/2014/05/02/make-offline-mirror-of-a-site-using-wget/ [11] https://linuxreviews.org/Wget:_download_whole_or_parts_of_websites_with_ease [12] https://brain-dump.space/articles/how-to-get-full-offline-website-copy-using-wget-on-mac-os/ [13] https://dev.to/jjokah/how-to-download-an-entire-website-for-offline-usage-using-wget-2lli [14] https://alvinalexander.com/linux-unix/how-to-make-offline-mirror-copy-website-with-wget [15] https://askubuntu.com/questions/979655/using-wget-and-having-websites-working-properly-offline

    7
    limit python requests to 1.39 requests per second
  • To achieve a rate of 1.39 requests per second, you can use a global variable to keep track of the time elapsed between requests and then calculate the delay based on that time. Here's a modified version of your fetch_github_data function that implements this:

    import time
    import requests
    import logging
    
    PERSONAL_ACCESS_TOKEN = "your_personal_access_token"
    DELAY = 1 / 1.39  # Calculate the delay for 1.39 requests per second
    last_request_time = 0  # Initialize a global variable to store the last request time
    
    
    def fetch_github_data(url):
        global last_request_time  # Access the global variable
    
        try:
            headers = {
                "Accept": "application/vnd.github+json",
                "Authorization": f"Bearer {PERSONAL_ACCESS_TOKEN}",
                "X-GitHub-Api-Version": "2022-11-28",
            }
    
            # Calculate the time elapsed since the last request
            time_elapsed = time.time() - last_request_time
    
            # Calculate the required delay based on the time elapsed
            required_delay = max(0, DELAY - time_elapsed)
    
            # Sleep for the required delay
            time.sleep(required_delay)
    
            response = requests.get(url, headers=headers)
    
            # Update the last request time
            last_request_time = time.time()
    
            logging.info(f"Fetched data from {url}")
            return response.json()
        except requests.exceptions.RequestException as e:
            logging.exception(f"Error fetching data from {url}\n{e}")
            raise
    

    This code calculates the required delay based on the desired rate of 1.39 requests per second and the time elapsed since the last request. It then sleeps for the required delay before making the next request. The global variable last_request_time is used to keep track of the time of the last request.
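The same idea can be packaged without a global variable. Below is a minimal sketch, not the original poster's code: `MinIntervalLimiter` is a hypothetical name, and the clock and sleep functions are injectable only so the behaviour can be checked without real waiting. Unlike the function above, it spaces out the start of each call rather than measuring from the end of the previous response.

```python
import time


class MinIntervalLimiter:
    """Enforce a minimum interval between calls to wait()."""

    def __init__(self, rate_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate_per_second  # seconds between calls
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, None before the first

    def wait(self):
        """Sleep just long enough to respect the rate; return the delay used."""
        delay = 0.0
        if self._last is not None:
            elapsed = self._clock() - self._last
            delay = max(0.0, self.min_interval - elapsed)
        if delay > 0:
            self._sleep(delay)
        self._last = self._clock()
        return delay
```

Usage would be to create `limiter = MinIntervalLimiter(1.39)` once and call `limiter.wait()` immediately before each `requests.get(...)`. It is not thread-safe as written; a lock around `wait()` would be needed for concurrent use.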


  • ShareGPT @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml

    limit python requests to 1.39 requests per second

    To limit your Python requests to 1.39 requests per second, you can use the time.sleep() function from the time module to introduce a delay between requests. You can also use third-party libraries like pyrate-limiter, requests-ratelimiter, or ratelimiter to achieve rate limiting.

    Here's an example using the time module:

    ```python
    import time

    import requests

    urls = [...]  # List of URLs to send requests to

    for url in urls:
        response = requests.get(url)
        print(response.status_code)
        time.sleep(1 / 1.39)  # Introduce a delay to achieve 1.39 requests per second
    ```

    Alternatively, you can use the pyrate-limiter library:

    ```python
    import requests
    from pyrate_limiter import Duration, Limiter, RequestRate

    # One request per 1/1.39 seconds, i.e. about 1.39 requests per second
    rate_limit = RequestRate(1, Duration.SECOND / 1.39)
    limiter = Limiter(rate_limit)

    urls = [...]  # List of URLs to send requests to

    for url in urls:
        # Assumes the pyrate-limiter 2.x API; delay=True blocks until a slot is free
        with limiter.ratelimit("requests", delay=True):
            response = requests.get(url)
            print(response.status_code)
    ```

    Another option is to use the requests-ratelimiter library:

    ```python
    from requests import Session
    from requests_ratelimiter import LimiterAdapter

    session = Session()

    # Apply a rate limit of 1.39 requests per second to all requests
    adapter = LimiterAdapter(per_second=1.39)
    session.mount('http://', adapter)
    session.mount('https://', adapter)

    urls = [...]  # List of URLs to send requests to

    for url in urls:
        response = session.get(url)
        print(response.status_code)
    ```

    These examples demonstrate different ways to limit your Python requests to 1.39 requests per second. Choose the method that best suits your needs and project requirements.


    4
    ShareGPT @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml

    How to avoid defining sqlite3 conn and cursor in many functions across the code?

    To avoid defining the sqlite3.Connection and sqlite3.Cursor in many functions throughout your code, you can use the following approaches:

    1. Use a context manager: Create a context manager that handles the connection and cursor creation, as well as closing the connection when done. This way, you can use the with statement to manage the connection and cursor in your functions.

    ```python
    import sqlite3

    DB_FILE = "your_database_file.db"


    class DatabaseConnection:
        def __enter__(self):
            self.conn = sqlite3.connect(DB_FILE)
            self.cursor = self.conn.cursor()
            return self.cursor

        def __exit__(self, exc_type, exc_val, exc_tb):
            # Commit only if the with-block finished without an exception
            if exc_type is None:
                self.conn.commit()
            else:
                self.conn.rollback()
            self.cursor.close()
            self.conn.close()


    # Issue is the caller's own class; the quoted annotation avoids a NameError here
    def insert_post_to_db(issue: "Issue", lemmy_post_id: int) -> None:
        with DatabaseConnection() as cursor:
            cursor.execute(
                "INSERT INTO posts (issue_url, lemmy_post_id, issue_title, issue_body) VALUES (?, ?, ?, ?)",
                (issue.url, lemmy_post_id, issue.title, issue.formatted_body),
            )
    ```

    2. Use a function decorator: You can create a decorator that handles the connection and cursor creation and passes the cursor to the decorated function.

    ```python
    import sqlite3
    from functools import wraps

    DB_FILE = "your_database_file.db"


    def with_cursor(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            conn = sqlite3.connect(DB_FILE)
            cursor = conn.cursor()
            try:
                result = func(cursor, *args, **kwargs)
                conn.commit()
                return result
            finally:
                # Always release the cursor and connection, even on error
                cursor.close()
                conn.close()

        return wrapper


    # Issue is the caller's own class; the quoted annotation avoids a NameError here
    @with_cursor
    def insert_post_to_db(cursor: sqlite3.Cursor, issue: "Issue", lemmy_post_id: int) -> None:
        cursor.execute(
            "INSERT INTO posts (issue_url, lemmy_post_id, issue_title, issue_body) VALUES (?, ?, ?, ?)",
            (issue.url, lemmy_post_id, issue.title, issue.formatted_body),
        )
    ```

    Both of these approaches help you avoid defining the connection and cursor in every function that interacts with the database. The context manager approach is more explicit and easier to read, while the function decorator approach is more concise. Choose the one that best fits your coding style and project requirements.
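A third variant, sketched here as an assumption rather than something from the answer above, is a generator-based context manager from the standard library's `contextlib`. The name `db_cursor` and the `db_file` parameter are hypothetical; passing the path in makes the helper easy to point at a scratch database.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def db_cursor(db_file):
    """Yield a cursor; commit on success, roll back on error, always close."""
    conn = sqlite3.connect(db_file)
    try:
        cursor = conn.cursor()
        yield cursor
        conn.commit()      # reached only if the with-block did not raise
    except Exception:
        conn.rollback()    # discard partial changes, then re-raise
        raise
    finally:
        conn.close()
```

A function would then use `with db_cursor("your_database_file.db") as cursor:` and call `cursor.execute(...)` inside the block, with no connection handling of its own.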


    0

    Are AI-Generated Solutions Welcomed on programming.dev?

    I wanted to start a discussion about the use of AI-generated solutions on Programming.dev. Personally, I've found that AI-powered tools have been incredibly helpful in solving programming questions. I won't name any specific commercial software, but I use one that combines GPT-4 and web search to get more factual information. I write some answers I think I might revisit to the ShareGPT community, but I would prefer posting programming solutions to this instance. However, I'm not sure if AI-generated solutions are welcomed on programming.dev. I'd love to hear your thoughts on this. If AI-generated responses are accepted, how should we format the answers, should we just copy paste without quoting, should we quote the model, just mention that it's AI-generated,...?

    13
    [Solved] How would you debug this script without creating many posts?
  • Just change lemmy.post.create to lemmy.post.createe to trigger an AttributeError. That way you can debug the code without creating any posts. You can also use many print statements all around the code, I would use two for each line to make sure the computer isn't fooling you. Lastly, you can spin up your own Lemmy instance to not have to worry about the generated posts.

  • Lemmy Support @lemmy.ml InternetPirate @lemmy.fmhy.ml

    How to see the feed of another instance?

    I'm wondering if it's possible to see the local feed of another instance from the one I'm using. I'm interested in exploring content from other instances without having to visit every single community, but I'm not sure how to do it. I've tried searching for a way to do this on the documentation and using the Lemmy search, but I haven't found any clear instructions. Does anyone know how to see the local feed of another instance? Any help or guidance would be greatly appreciated!

    5
    Lemmy Support @lemmy.ml InternetPirate @lemmy.fmhy.ml

    Does commenting on your own post bump it on the active filter view in Lemmy?

    In Lemmy, the active filter view is designed to prioritize posts with the latest activity, similar to how forums work. However, it remains unclear whether commenting on your own post in Lemmy will bump it on the active filter view. Some forum platforms, such as Discourse, allow a practice known as the "ghost bump," where users can make a post and delete it to draw attention to their post without adding new content[^1]. While it is uncertain if this is possible on Lemmy, it's worth noting that even if it were, it would result in an unnecessary comment that cannot be completely removed. The comment would still be visible, indicating that it was deleted by the post's creator. If you have any experience with Lemmy's active filter view or know whether commenting on your own post bumps it, please share your thoughts in the comments below.

    [^1]: What is "Bumping Topics"

    2

    Seeking a Forgejo-hosted Mirror for Lemmy Outside the US

    As an enthusiastic supporter of Lemmy, I am eager to contribute to the project. However, I hold strong reservations about writing a single line of code for a project hosted on a Micro$oft server. While I have created a few issues on GitHub, I firmly believe that my contributions could be significantly amplified if there were a mirror of Lemmy that utilized Forgejo hosting outside the United States. I would be absolutely delighted to have the opportunity to contribute more actively to this incredible project if such an alternative hosting option were available.

    10

    (Rumored Leak of) GPT-4 Architecture, Infrastructure, Training Dataset, Costs, Vision, MoE (11.07.2023)


    GPT-4's details are leaked.

    It is over.

    Everything is here: https://archive.is/2RQ8X

    Parameters count:

    GPT-4 is more than 10x the size of GPT-3. We believe it has a total of ~1.8 trillion parameters across 120 layers.

    Mixture Of Experts - Confirmed.

    OpenAI was able to keep costs reasonable by utilizing a mixture of experts (MoE) model. They utilize 16 experts within their model, each with about 111B parameters for the MLP. 2 of these experts are routed to per forward pass.

    MoE Routing:

    While the literature talks a lot about advanced routing algorithms for choosing which experts to route each token to, OpenAI’s is allegedly quite simple, for the current GPT-4 model.

    There are roughly 55B shared parameters for attention.

    Inference:

    Each forward pass inference (generation of 1 token) only utilizes ~280B parameters and ~560 TFLOPs. This contrasts with the ~1.8 trillion parameters and ~3,700 TFLOP that would be required per forward pass of a purely dense model.
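As a quick sanity check, the rumored figures above are consistent with each other. This is only arithmetic on the numbers quoted in the thread (16 experts, ~111B parameters each, ~55B shared attention parameters, 2 experts per forward pass), not confirmed information about the model.

```python
# All inputs are the rumored figures quoted in the thread, not verified facts.
EXPERTS = 16
PARAMS_PER_EXPERT = 111 * 10**9        # ~111B MLP parameters per expert
SHARED_ATTENTION_PARAMS = 55 * 10**9   # ~55B shared attention parameters
EXPERTS_PER_TOKEN = 2                  # experts routed to per forward pass

# Every expert counts toward the total model size.
total_params = EXPERTS * PARAMS_PER_EXPERT + SHARED_ATTENTION_PARAMS

# Only the routed experts plus the shared attention are used per token.
active_params = EXPERTS_PER_TOKEN * PARAMS_PER_EXPERT + SHARED_ATTENTION_PARAMS

print(f"total:  {total_params / 1e12:.2f}T parameters")   # total:  1.83T parameters
print(f"active: {active_params / 1e9:.0f}B parameters")   # active: 277B parameters
print(f"active share: {active_params / total_params:.1%}")
```

The totals line up with the "~1.8 trillion" and "~280B" figures, and the active share is close to the 560 / 3,700 TFLOP ratio quoted above.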

    Dataset:

    GPT-4 is trained on ~13T tokens.

    These are not unique tokens; repeated epochs are counted as additional tokens.

    Epoch number: 2 epochs for text-based data and 4 for code-based data.

    There are millions of rows of instruction fine-tuning data from ScaleAI & internally.

    GPT-4 32K

    There was an 8k context length (seqlen) for the pre-training phase. The 32k seqlen version of GPT-4 is based on fine-tuning of the 8k after the pre-training.

    Batch Size:

    The batch size was gradually ramped up over a number of days on the cluster, but by the end, OpenAI was using a batch size of 60 million! This, of course, is “only” a batch size of 7.5 million tokens per expert due to not every expert seeing all tokens.

    For the real batch size:

    Divide this number by the seq len to get the real batch size. Just stop with these misleading numbers already.
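The two batch-size figures can be worked out from the numbers in the thread. The sketch below assumes "8k" means a sequence length of 8,192 tokens, which the thread does not state exactly; the other inputs are the quoted rumors.

```python
# Inputs quoted in the thread (rumored, not verified).
BATCH_TOKENS = 60_000_000   # batch size in tokens at the end of the ramp-up
EXPERTS = 16
EXPERTS_PER_TOKEN = 2
SEQ_LEN = 8192              # assumption: "8k" context means 8,192 tokens

# Each token is routed to 2 of 16 experts, so each expert sees 1/8 of the batch.
tokens_per_expert = BATCH_TOKENS * EXPERTS_PER_TOKEN // EXPERTS

# The batch size in sequences is the token count divided by the sequence length.
sequences_per_batch = BATCH_TOKENS // SEQ_LEN

print(tokens_per_expert)    # 7500000
print(sequences_per_batch)  # 7324
```

So a "60 million" batch corresponds to roughly 7,300 sequences under that assumption.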

    Parallelism Strategies

    To parallelize across all their A100 GPUs, they utilized 8-way tensor parallelism, as that is the limit for NVLink.

    Beyond that, they are using 15-way pipeline parallelism.

    (They likely used ZeRO Stage 1. It is possible they used block-level FSDP.)

    Training Cost

    OpenAI’s training FLOPS for GPT-4 is ~2.15e25, on ~25,000 A100s for 90 to 100 days at about 32% to 36% MFU.

    Part of this extremely low utilization is due to an absurd number of failures requiring checkpoints that needed to be restarted from.

    If their cost in the cloud was about $1 per A100 hour, the training costs for this run alone would be about $63 million.

    (Today, the pre-training could be done with ~8,192 H100 in ~55 days for $21.5 million at $2 per H100 hour.)
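The cost and compute figures can be roughly reproduced from the quoted inputs. In the sketch below, the A100 peak of 312 TFLOPS (dense BF16) is an assumption taken from the hardware spec rather than from the thread, and the 95-day and 34% values are simply midpoints of the quoted ranges.

```python
HOURS_PER_DAY = 24
SECONDS_PER_DAY = 86_400

# A100 run: ~25,000 GPUs for 90 to 100 days at about $1 per GPU hour.
A100_GPUS = 25_000
a100_cost_low = A100_GPUS * 90 * HOURS_PER_DAY * 1    # dollars, 90 days
a100_cost_high = A100_GPUS * 100 * HOURS_PER_DAY * 1  # dollars, 100 days

# Hypothetical H100 run: ~8,192 GPUs for ~55 days at $2 per GPU hour.
h100_cost = 8_192 * 55 * HOURS_PER_DAY * 2

# Training compute: GPUs x peak FLOP/s x utilization x seconds.
A100_PEAK_FLOPS = 312e12   # assumption: dense BF16 peak, not from the thread
MFU = 0.34                 # midpoint of the quoted 32% to 36%
DAYS = 95                  # midpoint of the quoted 90 to 100 days
training_flops = A100_GPUS * A100_PEAK_FLOPS * MFU * DAYS * SECONDS_PER_DAY

print(f"A100 cost: ${a100_cost_low / 1e6:.0f}M to ${a100_cost_high / 1e6:.0f}M")
print(f"H100 cost: ${h100_cost / 1e6:.1f}M")
print(f"training compute: {training_flops:.2e} FLOPs")
```

The compute estimate lands near the quoted ~2.15e25 and the H100 cost near $21.5 million. The simple GPU-hour product for the A100 run gives $54M to $60M, a little under the quoted $63 million; the thread does not say what accounts for the difference.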

    Mixture of Expert Tradeoffs

    There are multiple MoE tradeoffs taken: For example, MoE is incredibly difficult to deal with on inference because not every part of the model is utilized on every token generation.

    This means parts may sit dormant when other parts are being used. When serving users, this really hurts utilization rates.

    Researchers have shown that using 64 to 128 experts achieves better loss than 16 experts, but that’s purely research.

    There are multiple reasons to go with fewer experts. One reason for OpenAI choosing 16 experts is that a larger number of experts makes it harder to generalize across many tasks. More experts can also make convergence more difficult to achieve.

    With such a large training run, OpenAI instead chose to be more conservative on the number of experts.

    GPT-4 Inference Cost

    GPT-4 costs 3x that of the 175B parameter Davincci.

    This is largely due to the larger clusters required for GPT-4 and much lower utilization achieved.

    An estimate of its cost is $0.0049 cents per 1k tokens for 128 A100s to run inference on GPT-4 at 8k seqlen, and $0.0021 cents per 1k tokens for 128 H100s to run inference on GPT-4 at 8k seqlen. Note that this assumes decently high utilization and keeping batch sizes high.

    Multi-Query Attention

    OpenAI are using MQA just like everybody else.

    Because of that, only 1 head is needed and memory capacity can be significantly reduced for the KV cache. Even then, the 32k seqlen GPT-4 definitely cannot run on 40GB A100s, and the 8k version is capped on max batch size.
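
    To give a feel for the KV cache saving, here is a back-of-the-envelope sketch. The layer count takes the "120" quoted later in this thread as a number of layers; the number of query heads, the head dimension and the 2-byte (FP16) cache entries are hypothetical values chosen only for illustration, not known GPT-4 settings.

```python
def kv_cache_bytes(seq_len, layers, kv_heads, head_dim, bytes_per_value=2):
    """KV cache size for one sequence.

    Factor 2: one key vector and one value vector per token, layer and KV head.
    """
    return 2 * seq_len * layers * kv_heads * head_dim * bytes_per_value


LAYERS = 120       # taking the "120" quoted in this thread as the layer count
QUERY_HEADS = 96   # hypothetical
HEAD_DIM = 128     # hypothetical

for seq_len in (8_192, 32_768):
    mha = kv_cache_bytes(seq_len, LAYERS, QUERY_HEADS, HEAD_DIM)  # one KV head per query head
    mqa = kv_cache_bytes(seq_len, LAYERS, 1, HEAD_DIM)            # a single shared KV head
    print(f"seq_len={seq_len}: MHA {mha / 2**30:.2f} GiB, MQA {mqa / 2**30:.2f} GiB per sequence")
```

    Whatever the real dimensions are, the cache shrinks by a factor equal to the number of query heads, and it still grows linearly with sequence length and batch size, which is why the long-context variant stays memory-bound.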

    Continuous batching

    OpenAI implements both variable batch sizes and continuous batching. This allows some bound on maximum latency while also optimizing inference costs.
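
    A toy sketch of the continuous batching idea, not OpenAI's scheduler: instead of waiting for the whole batch to finish, free slots are refilled with waiting requests before every decode step. The request format and function name are made up for illustration.

```python
from collections import deque


def continuous_batching(requests, max_batch):
    """Simulate decode steps; requests is a list of (request_id, tokens_to_generate).

    Returns {request_id: step at which the request finished}.
    """
    waiting = deque(requests)
    active = {}     # request_id -> tokens still to generate
    finished = {}
    step = 0
    while waiting or active:
        # Refill free slots before each decode step.
        while waiting and len(active) < max_batch:
            request_id, num_tokens = waiting.popleft()
            active[request_id] = num_tokens
        step += 1
        # One decode step produces one token for every active request.
        for request_id in list(active):
            active[request_id] -= 1
            if active[request_id] <= 0:
                finished[request_id] = step
                del active[request_id]
    return finished


print(continuous_batching([("a", 2), ("b", 5), ("c", 3)], max_batch=2))
```

    In this example request "c" takes over the slot freed by "a" and finishes at step 5. With static batching it would have to wait for "b" and finish at step 8.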

    Vision Multi-Modal

    It is a separate vision encoder from the text encoder, with cross-attention. The architecture is similar to Flamingo. This adds more parameters on top of the 1.8T of GPT-4. It is fine-tuned with another ~2 trillion tokens, after the text-only pre-training.

    On the vision model, OpenAI wanted to train it from scratch, but it wasn’t mature enough, so they wanted to derisk it by starting with text.

    One of the primary purposes of this vision capability is for autonomous agents able to read web pages and transcribe what’s in images and video.

    Some of the data they train on is joint data (rendered LaTeX/text), screenshots of web pages, and YouTube videos: they sample frames and run Whisper on them to get transcripts.

    [Don't want to say "I told you so", but...]

    Speculative Decoding

    OpenAI might be using speculative decoding on GPT-4's inference. (not 100% sure)

    The idea is to use a smaller, faster model to decode several tokens in advance, and then feed them into a large oracle model as a single batch.

    If the small model was right about its predictions – the larger model agrees and we can decode several tokens in a single batch.

    But if the larger model rejects the tokens predicted by the draft model, then the rest of the batch is discarded and we continue with the larger model.
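
    The accept/reject loop described above can be sketched with toy models. This is a greedy variant under simplifying assumptions: both "models" are plain functions from a token list to the next token, and the oracle is called once per position, whereas a real system scores all draft positions in one batched forward pass. All names are made up for illustration.

```python
def speculative_decode(draft_next, target_next, prompt, num_new, k=4):
    """Greedy speculative decoding; returns prompt plus num_new generated tokens."""
    tokens = list(prompt)
    goal = len(prompt) + num_new
    while len(tokens) < goal:
        # 1. The small draft model guesses k tokens ahead.
        proposal = []
        for _ in range(k):
            proposal.append(draft_next(tokens + proposal))
        # 2. The large oracle model checks each guessed position.
        base = list(tokens)
        for i, guess in enumerate(proposal):
            expected = target_next(base + proposal[:i])
            tokens.append(expected)   # always keep the oracle's token
            if expected != guess:
                break                 # discard the rest of the draft
    return tokens[:goal]


def target(ctx):
    """Toy oracle: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def draft(ctx):
    """Toy draft model: same as the oracle, but wrong after a 4."""
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


print(speculative_decode(draft, target, [0], num_new=8))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

    In this strict greedy form the output is identical to decoding with the oracle alone; the draft model only changes how many tokens are confirmed per oracle pass.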

    The conspiracy theory that the quality of the new GPT-4 has deteriorated might simply be because they are letting the oracle model accept lower-probability sequences from the speculative decoding model.

    Inference Architecture

    The inference runs on a cluster of 128 GPUs.

    There are multiple of these clusters in multiple datacenters in different locations.

    It is done in 8-way tensor parallelism and 16-way pipeline parallelism.

    Each node of 8 GPUs has only ~130B parameters, or… twitter.com/i/web/status/1…

    The model has 120 layers, so it fits in 15 different nodes. [Possibly there are fewer layers on the first node, since it also needs to compute the embeddings.]
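
    The cluster arithmetic, using only the figures quoted above and taking the 120 figure as a layer count. Note that the thread quotes both 16-way pipeline parallelism and 15 nodes holding the layers, so this sketch shows both splits without picking one.

```python
TENSOR_PARALLEL = 8     # GPUs per node sharing each layer
PIPELINE_STAGES = 16    # quoted pipeline-parallel degree
LAYERS = 120            # taking the quoted "120" as the layer count


def total_gpus(tensor_parallel, pipeline_stages):
    """GPUs needed for one model replica."""
    return tensor_parallel * pipeline_stages


def layers_per_node(layers, nodes):
    """Average number of layers hosted on each node."""
    return layers / nodes


print(total_gpus(TENSOR_PARALLEL, PIPELINE_STAGES))  # 128
print(layers_per_node(LAYERS, 15))                   # 8.0
print(layers_per_node(LAYERS, 16))                   # 7.5
```

    An even 8 layers per node over 15 nodes would leave one of the 16 pipeline stages for something else, or with a lighter load, which fits the remark about the first node also handling the embeddings.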

    According to these numbers, OpenAI should have trained on 2x the tokens if they were trying to be Chinchilla-optimal.

    [let alone surpass it like we do]

    This goes to show that they are struggling to get high-quality data.

    Why no FSDP?

    A possible reason for this could be that some of the hardware infra they secured is of an older generation.

    This is pretty common at local compute clusters, as the organisation usually upgrades the infra in several "waves" to avoid a complete pause of operation.… twitter.com/i/web/status/1…

    Dataset Mixture

    They trained on 13T tokens.

    CommonCrawl & RefinedWeb are both 5T.

    Remove the duplication of tokens from multiple epochs and we get to a much more reasonable number of "unaccounted for" tokens: The "secret" data.

    By this point there are already rumors that parts of it came from Twitter, Reddit & YouTube.

    [Rumors that start to become lawsuits]

    Some speculations are:

    • LibGen (4M+ books)
    • Sci-Hub (80M+ papers)
    • All of GitHub

    My own opinion:

    The missing dataset is a custom dataset of college textbooks collected by hand for as many courses as possible.

    This is very easy to convert to a txt file and then, with self-instruct, into instruction form.

    This creates the "illusion" that GPT-4 "is smart" no matter who uses it.

    Computer scientist? Sure! It can help you with your questions about P!=NP.

    Philosophy major? It can totally talk to you about epistemology.

    Don't you see?

    It was trained on the textbooks. It is so obvious.

    There are also papers that try to forcibly extract memorized parts of books from GPT-4 to understand what it was trained on.

    There are some books it knows so well that it has seen them for sure.

    Moreover, if I remember correctly, it even knows the unique IDs of Project Euler exercises.

    0
    Focused Transformer: Contrastive Training for Context Scaling - 256k context length AI
  • The paper actually demonstrates a 16-million context window with 92% accuracy. Most models can be retrained to have a 100k context window with over 92% accuracy, but the accuracy drops to 74% at 256k. The code has already been released on GitHub as well. I'm excited to see the development of 100k models using this method soon!

  • Focused Transformer: Contrastive Training for Context Scaling - 256k context length AI

    Summary:

    > Focused Transformer: A new technique for long-context language modeling. The paper introduces Focused Transformer (FOT), a method that uses contrastive learning and external memory to improve the structure of the (key, value) space and extend the context length of transformer models. FOT can fine-tune existing large models without changing their architecture and achieve better performance on tasks that require long context.
    >
    > LONGLLAMA: Extending LLaMA’s context length with FOT. The paper demonstrates the application of FOT to fine-tune OpenLLaMA models, which are large language models with memory augmentation. The resulting models, called LONGLLAMAs, can handle a context length of up to 256k tokens and show improvements on few-shot learning tasks such as TREC and WebQS.
    >
    > Distraction issue: A key challenge for scaling context length. The paper identifies the distraction issue as a major obstacle for using large memory databases in multi-document scenarios. The distraction issue occurs when keys from irrelevant documents overlap with keys from relevant ones, making them hard to distinguish. FOT alleviates this issue by exposing the memory attention layer to both positive and negative examples during training.

    ELI5

    > Sure! Imagine you have a toy box with lots of toys inside. You want to find your favorite toy, but there are so many toys that it's hard to find it. The Focused Transformer is like a special helper that can look inside the toy box and find your favorite toy quickly, even if there are lots of other toys in the way. It does this by remembering which toys are important and which ones are not, so it can find the right toy faster. Does that make sense?

    Implications

    > The Focused Transformer (FOT) technique has the potential to improve the performance of language models by extending their context length. This means that the models can better understand and incorporate new information, even when it is spread across a large number of documents. The resulting LONGLLAMA models show significant improvements on tasks that require long-context modeling, such as retrieving information from large databases. This research could have implications for natural language processing, code generation, quantitative reasoning, and theorem proving, among other areas. It could also make it easier to fine-tune existing large-scale models to lengthen their effective context. Is there anything else you would like to know?

    4

    Where can I seek medical advice online?

    Recently, I found myself questioning the accuracy of a diagnosis provided by a doctor I visited. Surprisingly, an AI seemed to offer a more insightful assessment. However, I understand the importance of not solely relying on AI-generated information. With that in mind, I'm eager to discover a reputable online platform where I can seek medical advice. Ideally, I hope to find a community where I can obtain multiple opinions to make a more informed decision about my health. If anyone could recommend such a site, I would greatly appreciate it.

    10
    ShareGPT @lemmy.fmhy.ml InternetPirate @lemmy.fmhy.ml

    Price of insulin in China, Cuba, Laos, North Korea, and Vietnam?

    Information on the price of insulin in China, Cuba, Laos, North Korea, and Vietnam is not readily available in the search results. However, there is information on the cost of insulin in some other countries. According to the World Population Review, the cost of insulin varies widely by country, with the United States having the highest prices[1]. In 2018, RAND Corporation reported that the manufacturer price for any given type of insulin averaged five to ten times higher in the United States than in all other OECD countries[1]. Novo Nordisk, a major insulin manufacturer, has recently slashed the price of insulin in the United States[3].

    A study published in PMC in 2019 examined the prices, availability, and affordability of insulin in 13 low-income and middle-income countries. The study found that median procurement prices of human insulins across the countries ranged from $4.56 to $5.30, with large price variations seen across the countries[4]. The highest prices were found in two Chinese provinces, with isophane human insulin costing $24.72 in Shaanxi, China[4].

    There is also information on the cost of insulin in South Korea, which stands at an average of $10.30 per unit[2]. According to Volza's Global Import data, insulin syringe import shipments from South Korea stood at 1.6K, imported by 98 world importers from 82 South Korean suppliers[5]. China is one of the top exporters of insulin syringes, with 2,155 shipments[6].

    Citations:

    [1] https://worldpopulationreview.com/country-rankings/cost-of-insulin-by-country

    [2] https://wisevoter.com/country-rankings/cost-of-insulin-by-country/

    [3] https://www.fiercepharma.com/pharma/novo-nordisk-follows-eli-lillys-lead-slashing-insulin-prices-us

    [4] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6570978/

    [5] https://www.volza.com/imports-global/global-import-data-of-insulin+syringe-from-south-korea

    [6] https://www.volza.com/p/syringe-insulin/export/export-from-china/

    0

    French Courts Are Giving Protesters INSANE Sentences

    French courts have been imposing disproportionately severe sentences for minor offenses, including 10 months in prison for stealing a can of Red Bull and one year for a homeless boy with schizophrenia caught looting a luxury store. The overwhelmed courts rush cases, provide minimal time for defendants, and prioritize punishment under the instruction of the Justice Minister. Furthermore, the French government is censoring social media and justifying it by claiming to protect public order, but it infringes upon free speech and mirrors tactics used by authoritarian regimes. The justice system exhibits a double standard, favoring the privileged, and creates a class divide, leading to unrest. Ironically, the government compares itself to oppressive nations while undermining democratic principles.

    14

    'Deepfake' Kanye Video Warns of Disinformation and Civil Unrest: 'AI Will Kill The Media Industry' (Yahoo article 05.07.2023)

    finance.yahoo.com 'Deepfake' Kanye Video Warns of Disinformation and Civil Unrest: 'AI Will Kill The Media Industry'

    A controversial short film is raising eyebrows and questions about the use of "deepfakes" and generative AI. Created in just 7 days on a budget of only $30, the AI-generated video set to the song "War With The Matrix" is a radical prediction of the impact of AI on media and society by acclaimed hip-...

    [YANDHI - WAR WITH THE MATRIX (KANYE AI X BIG BABY GANDHI)](https://youtube.com/watch?v=CGyPqImBOjY)

    0
    PSA: Lemmy votes can be manipulated
  • I feel like this is what happened when you’d see posts with hundreds / thousands of upvotes but had only 20-ish comments.

    Nah it's the same here in Lemmy. It's because the algorithm only accounts for votes and not for user engagement.

  • What would you do if you had access to a superintelligent AGI?
  • Locked in a room with an internet connection? A lot. But without any contact with the outside world? Not nearly as much. It could have other people running experiments for it with an internet connection, but not without one.

    Anyway, whether or not the AGI can interact with the real world undermines the purpose of my explicit statement in the question. I specifically mentioned that it only operates as a human on a computer. I didn't mention it could acquire a physical body, so let's just assume it can't and can't use other people to do physical labor either.

  • What would you do if you had access to a superintelligent AGI?
  • I heard disruptive science is slowing down which I think means pretty much everything possible has already been thought of. So talking about things that exist, do you mean a cheaper solar panel or wind/water turbine? Or are we talking about science fiction like an Arc Reactor?

  • What would you do if you had access to a superintelligent AGI?

    Imagine an AGI (Artificial General Intelligence) that could perform any task a human can do on a computer, but at a much faster pace. This AGI could create an operating system, produce a movie better than anything you've ever seen, and much more, all while being limited to SFW (Safe For Work) content. What are the first things you would ask it to do?

    60